TOGoS/TSoft40

TOGSOFT-40: The TOGoS Software System

Specifications for data formats, RDF concepts, and such.

General approach

(untested advice to self as of 2024-10-19):

Common data formats

Turtle? Though ContentCouch uses RDF+XML. But XML is a pain to deal with.

Process concepts

InputType ->{effects} OutputType can be used as a lowest-common-denominator to describe everything else.

  • Functions with side-effects
  • Pure functions
    • A special case of function-with-side-effects whose list of possible effects is empty
    • Single argument, single return value
    • Multiple, possibly named arguments
    • Multiple, possibly named return values (think continuation passing style; returning and calling are the same kind of thing)
  • Stack operations
    • Which can be thought of as functions that take a stack and return a stack, with or without side-effects
  • Stream processes
    • ProtoProcess = () ->{Read,Write,...} Int32
    • Those that eat bytes, emit bytes, and finally exit with some error code
      • Like an OS process, but without side-effects
    • Those that do all that but can also have other effects
      • Basically like an OS process
  • Commands
    • In the spirit of shell or Tcl commands
    • List String -> Map String String -> ProtoProcess, i.e. a function that takes a list of arguments (argv) and environment variable values and returns some representation of a process that can be executed any number of times.
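A minimal sketch of the Command and ProtoProcess shapes above, in Python. Python has no effect system, so the Read and Write effects are modelled here as explicit stream arguments; that modelling, and the names upcase_command and run, are my own assumptions for illustration, not part of any existing TOGoS library.

```python
import io
from typing import BinaryIO, Callable, Mapping, Sequence, Tuple

# A stream process: eats bytes, emits bytes, exits with a code.
# Roughly `() ->{Read,Write} Int32`, with the effects made explicit
# as the two stream parameters.
ProtoProcess = Callable[[BinaryIO, BinaryIO], int]

# A command: List String -> Map String String -> ProtoProcess
Command = Callable[[Sequence[str], Mapping[str, str]], ProtoProcess]


def upcase_command(argv: Sequence[str], env: Mapping[str, str]) -> ProtoProcess:
    """Hypothetical command: copies input to output, uppercased."""
    def process(stdin: BinaryIO, stdout: BinaryIO) -> int:
        for line in stdin:
            stdout.write(line.upper())
        return 0  # exit code
    return process


def run(process: ProtoProcess, input_bytes: bytes) -> Tuple[int, bytes]:
    """Execute a process against in-memory streams."""
    out = io.BytesIO()
    code = process(io.BytesIO(input_bytes), out)
    return code, out.getvalue()
```

Because the command returns a description of a process rather than running it, the same ProtoProcess value can be passed to run any number of times.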

Choice of programming language

Former favorite language for getting things done: Ruby.

More recently, TypeScript, via Deno.

I think that Unison is a good idea. I find it cumbersome to program in, but its notation for indicating effects is useful for formalizing ideas outside of Unison.

Factor seems very close to the idea of a Forthlike language with high-level constructs that I keep trying to build.

As of 2024-10-19 I am on a Raku (formerly 'Perl 6') kick. If nothing else, Perl is interesting.

Where to put 'scratch' projects

Make a new sub-project of Scratch38. Start here.

TScript34 has a lot of random sub-projects, too.

If something outgrows TScript34, it could get its own TOG Software Project number.

This is too ambiguous. I might standardize on 'everything is a TOG Software Project', but using high numbers with lots of digits for probably-throwaway things.

Projbase Metadata

An RDF+XML file called ".projbase.rdf.xml" in a project directory can hold some metadata about the project. The root element of that file should describe a http://ns.nuke24.net/X-2024/Project/Projbase, and might include attributes like http://ns.nuke24.net/X-2024/Project/projbaseName, http://purl.org/dc/terms/title, and http://purl.org/dc/terms/description

See https://www.nuke24.net/docs/ns/namespaces.tef for more terms.
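A rough sketch of generating such a file's content with Python's standard library. It assumes the root element is itself the typed Projbase node (no rdf:RDF wrapper) and that the properties are written as namespaced attributes; the prefixes "proj" and "dcterms" and the function name are arbitrary choices of mine.

```python
import xml.etree.ElementTree as ET

PROJECT_NS = "http://ns.nuke24.net/X-2024/Project/"
DCTERMS_NS = "http://purl.org/dc/terms/"


def projbase_rdf_xml(name: str, title: str, description: str) -> str:
    """Build the text of a .projbase.rdf.xml (hypothetical helper)."""
    # Prefix names are only cosmetic; the namespace URIs are what matter.
    ET.register_namespace("proj", PROJECT_NS)
    ET.register_namespace("dcterms", DCTERMS_NS)
    root = ET.Element(
        f"{{{PROJECT_NS}}}Projbase",
        {
            f"{{{PROJECT_NS}}}projbaseName": name,
            f"{{{DCTERMS_NS}}}title": title,
            f"{{{DCTERMS_NS}}}description": description,
        },
    )
    return ET.tostring(root, encoding="unicode")
```

Calling projbase_rdf_xml("TSoft40", "TOGSOFT-40", "...") returns a single-element document whose tag is Projbase in the Project namespace.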

Naming Things

Generate UUIDs or OIDs for new concepts rather than coming up with a 'nice' name right away. They can always be aliased together later.

Can use names like http://ns.nuke24.net/X-2024/Whatever as placeholders.
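For example, a small Python sketch that mints identifiers for a new concept. The UUID-derived OID uses the standard 2.25 arc (the UUID written as a decimal integer); the function names are made up for illustration.

```python
import uuid


def new_concept_ids() -> dict:
    """Mint a fresh UUID and the equivalent UUID-derived OID."""
    u = uuid.uuid4()
    return {
        "uuid_urn": u.urn,                   # urn:uuid:...
        "oid_urn": f"urn:oid:2.25.{u.int}",  # same UUID under the 2.25 OID arc
    }


def placeholder_name(local_name: str) -> str:
    """Placeholder name in the X-2024 namespace mentioned above."""
    return "http://ns.nuke24.net/X-2024/" + local_name
```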

Backups

Want to back up a bunch of data from friends/family/yourself?

First of all, don't spend too much time trying to curate what gets archived or not. This will result in failing to actually do the backups.

Do use a content-addressing datastore for storing files. This way you can back up a file any number of times without using up additional disk space.
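To illustrate why repeated backups cost nothing extra, here is a toy content-addressed store in Python. This is not ContentCouch's actual on-disk layout or hash choice; SHA-256 and the two-character subdirectory are assumptions made only for the sketch.

```python
import hashlib
from pathlib import Path


def store_blob(store_dir: Path, data: bytes) -> str:
    """Store data under its own hash; storing it again is a no-op."""
    digest = hashlib.sha256(data).hexdigest()
    target = store_dir / digest[:2] / digest
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return digest


def count_stored_files(store_dir: Path) -> int:
    """Count the blobs currently in the store."""
    return sum(1 for p in store_dir.rglob("*") if p.is_file())
```

Storing the same bytes twice returns the same identifier and leaves exactly one file on disk.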

Do buy some large hard drives and maybe some M-discs and a writer.

Procedure:

  • Give the original media/partition a name.
    • e.g. WSITEM-3334.5 is a partition on WSITEM-3334, an SSD I bought in 2024.
    • Set the partition label to that name, or some shorthand, and possibly add a .partition-info.txt or something in the root to help identify it or provide other information 10-years-from-now you might want to know.
  • Generate a manifest of the files using ccouch3 walk-fs.
    • I like to store these on the partition as /manifests/<name>-<subdir>-<date>.manifest, e.g. WSITEM-202003.1-DCIM-20260207T1336.manifest.
    • Feel free to only include certain files/directories, but it's also fine to go ahead and include everything, including earlier manifests
  • Store the manifests in CCouch.
    • I use the file-manifests sector for this.
  • Reference manifests in a central location.
    • I have a FileManifests Git repo, but these files can be so large that they cause problems for Git. It might be better to store them in CCouch and just reference them by hash from a central Git repo. i.e. I don't yet have a specific way to do this.
  • Back up individual files to CCouch.
    • Don't worry about cleaning up the directory structure first.
    • If it's too much trouble, don't even worry about storing the directory structure; you've already recorded exactly what was on the partition in the manifest.
    • todo: improve CCouch tooling to be able to, like, weak-ignore some files; reference them but don't copy.
  • Write collections of files to M-disc or other ROM.
    • If not too inconvenient, include the manifest (under /manifests/), the CCouch tree or directory URI (in /.ccouch-uris or /.commit-uris, respectively), and any CCouch directory data (under /.ccouch/data/).
    • Feel free to include other manifests and other data in those directories, too. An updated manifest for this partition should include all of it.
  • Store the manifest for your M-discs in your central manifest repository, also.
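The manifest naming convention from the procedure above can be sketched as a small Python helper. The timestamp format is inferred from the example filename; the function name is hypothetical, and generating the manifest itself is still left to ccouch3 walk-fs.

```python
from datetime import datetime


def manifest_path(partition_name: str, subdir: str, when: datetime) -> str:
    """Build /manifests/<name>-<subdir>-<date>.manifest."""
    stamp = when.strftime("%Y%m%dT%H%M")  # e.g. 20260207T1336
    return f"/manifests/{partition_name}-{subdir}-{stamp}.manifest"
```

With partition "WSITEM-202003.1", subdirectory "DCIM", and 2026-02-07 13:36, this yields the example path given above.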

Then, later on, you can:

  • Write a tool to index all the manifests and CCouch directory data and help you find out what files are available where
    • As of 2026-02-10, I have been meaning to build this tool for a few years, but have not yet done so. Presumably it will tie into TOGETL somehow.
  • In lieu of a nice tool, grep through manifests.
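Until that indexing tool exists, the grep approach might look like this in Python. It treats manifests as plain text and assumes nothing about their internal format; the function name and the *.manifest glob are my assumptions.

```python
from pathlib import Path
from typing import Iterator, Tuple


def grep_manifests(manifest_dir: Path, needle: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (manifest file name, line number, line) for each matching line."""
    for path in sorted(manifest_dir.glob("*.manifest")):
        # errors="replace" so odd filenames inside a manifest don't abort the scan
        with path.open("r", encoding="utf-8", errors="replace") as f:
            for lineno, line in enumerate(f, start=1):
                if needle in line:
                    yield path.name, lineno, line.rstrip("\n")
```

Each hit says which manifest, and therefore which named partition or disc, mentions the file.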

TODO: how to back up Git repos or similar structures that do their own content addressing/deduplication?

Terminology

  • project (task) :: a set of tasks to be completed
  • projbase, project (codebase) :: a versioned set of files, usually stored in one or more Git repositories
  • TEF :: TOGoS's Entry Format, a relatively human and machine-readable/writable format for storing arbitrary information in a loosely-structured way
  • TSV :: Tab-separated values; I have my own dialect that I follow, which is itself a line-oriented 'hash-format'

Links

RDF concepts

  • namespaces.tef, which should probably be moved into this repository.
  • TOGVM
    • RDF types and predicates for describing functional programs

TODO: Consolidate all of them!

File formats

I try to stick to a relatively consistent convention for text-based formats, which is that # can be used to indicate line comments or half-out-of-band data. If followed by whitespace, it's a comment. If followed by a word, it may have special meaning. #! is also treated as a comment to allow for shebang lines.
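A sketch of that convention as a line classifier in Python. The category names are mine, and two cases the text leaves open are decided arbitrarily here: a bare "#" counts as a comment, and "#" followed by anything other than whitespace, "!", or a letter is reported as "unspecified".

```python
def classify_line(line: str) -> str:
    """Classify one line under the '#' convention described above."""
    if not line.startswith("#"):
        return "data"
    rest = line[1:]
    if rest.startswith("!"):
        return "comment"      # shebang line
    if rest == "" or rest[0].isspace():
        return "comment"      # '# ...' (bare '#' is my assumption)
    if rest[0].isalpha() or rest[0] == "_":
        return "directive"    # '#word' may have special meaning
    return "unspecified"      # not covered by the convention as stated
```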

XML datatypes

i.e. lexical <-> value encodings

Libraries

URI schemes

  • active:
    • For representing function applications as URIs
  • urn:bitprint:
  • x-git-object:
    • For referencing blobs, directories, and commits by their Git hash (which is based on, but isn't exactly, SHA-1)
  • x-rdf-subject:
    • Prefix to indicate the concept described by the document identified by the rest of the URI.
  • urn:oid:
    • Identify things by OID
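As an illustration of the x-git-object: note, here is how Git derives a blob's ID, in Python. Git hashes a short header plus the content rather than the content alone, which may be what "based on, but isn't exactly, SHA-1" refers to. The exact syntax after the scheme name is an assumption on my part (prefix followed by the hex ID).

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Git's object ID for a blob: SHA-1 of 'blob <length>\\0' + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def x_git_object_uri(content: bytes) -> str:
    # Assumed URI form: scheme prefix followed by the hex object ID.
    return "x-git-object:" + git_blob_id(content)
```

For the empty blob this gives e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, matching what git hash-object reports for an empty file.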

Etc
