Specifications for data formats, RDF concepts, and such.
(untested advice to self as of 2024-10-19):
Turtle? Though ContentCouch uses RDF+XML. But XML is a pain to deal with.
InputType ->{effects} OutputType can be used as a
lowest-common-denominator to describe everything else.
- Functions with side-effects
- Pure functions
- A special case of function-with-side-effects whose list of possible effects is empty
- Single argument, single return value
- Multiple, possibly named arguments
- Multiple, possibly named return values (think continuation passing style; returning and calling are the same kind of thing)
- Stack operations
- Which can be thought of as functions that take a stack and return a stack, with or without side-effects
- Stream processes
- ProtoProcess = () ->{Read,Write,...} Int32
- Those that eat bytes, emit bytes, and finally exit with some error code
- Like an OS process, but without side-effects
- Those that do all that but can also have other effects
- Basically like an OS process
- Commands
- In the spirit of shell or Tcl commands
- List String -> Map String String -> ProtoProcess, i.e. a function that takes a list of arguments (argv) and environment variable values and returns some representation of a process that can be executed any number of times.
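A minimal TypeScript sketch of the ProtoProcess and command shapes above. TypeScript has no effect annotations, so this assumes the {Read,Write,...} effects are passed in as an explicit handler object. The names StreamEffects, echo, and runInMemory are made up for illustration, and the PREFIX variable is only there to show the environment being used.

```typescript
// Hypothetical handler object standing in for the {Read,Write,...} effect set.
interface StreamEffects {
  read(): Uint8Array | null;      // next chunk of input, or null at end of input
  write(chunk: Uint8Array): void; // emit a chunk of output
}

// ProtoProcess = () ->{Read,Write,...} Int32
// The effects become an explicit parameter; the return value is the exit code.
type ProtoProcess = (effects: StreamEffects) => number;

// Command = List String -> Map String String -> ProtoProcess
type Command = (argv: string[]) => (env: Map<string, string>) => ProtoProcess;

// Example command: writes its arguments, preceded by an optional PREFIX variable.
const echo: Command = (argv) => (env) => (effects) => {
  const prefix = env.get("PREFIX") ?? "";
  effects.write(new TextEncoder().encode(prefix + argv.join(" ") + "\n"));
  return 0;
};

// Run a process against in-memory input, collecting its output as a string.
function runInMemory(
  proc: ProtoProcess,
  input: Uint8Array[] = [],
): { exitCode: number; output: string } {
  const pending = [...input];
  const decoder = new TextDecoder();
  let output = "";
  const exitCode = proc({
    read: () => pending.shift() ?? null,
    write: (chunk) => {
      output += decoder.decode(chunk, { stream: true });
    },
  });
  return { exitCode, output };
}

// The same process value can be executed any number of times.
const hello = echo(["hello", "world"])(new Map([["PREFIX", "> "]]));
const firstRun = runInMemory(hello);  // { exitCode: 0, output: "> hello world\n" }
const secondRun = runInMemory(hello); // same result again
```

Because the process is just a function of its effect handlers, running it against a real stdin/stdout or against test buffers is only a matter of which handler object gets passed in.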
Former favorite language for getting things done: Ruby.
More recently, TypeScript, via Deno.
I think that Unison is a good idea, though I find it cumbersome to program in, but its notation for indicating effects is useful for formalizing ideas outside of Unison.
Factor seems very close to the idea of a Forth-like language with high-level constructs that I keep trying to build.
As of 2024-10-19 I am on a Raku (formerly 'Perl 6') kick. If nothing else, Perl is interesting.
Make a new sub-project of Scratch38. Start here.
TScript34 has a lot of random sub-projects, too.
If something outgrows TScript34, it could get its own TOG Software Project number.
This is too ambiguous. I might standardize on 'everything is a TOG Software Project', but using high numbers with lots of digits for probably-throwaway things.
An RDF+XML file called ".projbase.rdf.xml" in a project directory
can hold some metadata about the project. The root element
of that file should describe a http://ns.nuke24.net/X-2024/Project/Projbase,
and might include attributes like http://ns.nuke24.net/X-2024/Project/projbaseName,
http://purl.org/dc/terms/title, and http://purl.org/dc/terms/description
See https://www.nuke24.net/docs/ns/namespaces.tef for more terms.
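A minimal sketch of generating such a file from TypeScript. It assumes the properties are written as child elements of a typed root element with no rdf:RDF wrapper (which RDF/XML allows when there is a single top-level node). The function name projbaseRdfXml, the ProjbaseInfo shape, and the proj:/dc: prefixes are choices made for this example, not an established convention.

```typescript
const PROJ_NS = "http://ns.nuke24.net/X-2024/Project/";
const DC_NS = "http://purl.org/dc/terms/";

// Hypothetical input shape for this sketch.
interface ProjbaseInfo {
  projbaseName: string;
  title?: string;
  description?: string;
}

// Escape the characters that are unsafe in XML text content.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build the text of a .projbase.rdf.xml file describing one Projbase.
function projbaseRdfXml(info: ProjbaseInfo): string {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    `<proj:Projbase xmlns:proj="${PROJ_NS}" xmlns:dc="${DC_NS}">`,
    `  <proj:projbaseName>${escapeXml(info.projbaseName)}</proj:projbaseName>`,
  ];
  if (info.title !== undefined) {
    lines.push(`  <dc:title>${escapeXml(info.title)}</dc:title>`);
  }
  if (info.description !== undefined) {
    lines.push(`  <dc:description>${escapeXml(info.description)}</dc:description>`);
  }
  lines.push("</proj:Projbase>");
  return lines.join("\n") + "\n";
}

const exampleXml = projbaseRdfXml({
  projbaseName: "Scratch38",
  title: "Scratch & experiments",
});
```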
Generate UUIDs or OIDs for new concepts rather than coming up with a 'nice' name right away. They can always be aliased together later.
Can use names like http://ns.nuke24.net/X-2024/Whatever as placeholders.
Want to back up a bunch of data from friends/family/yourself?
First of all, don't spend too much time trying to curate what gets archived or not. This will result in failing to actually do the backups.
Do use a content-addressing datastore for storing files. This way you can back up a file any number of times without using up additional disk space.
Do buy some large hard drives and maybe some M-discs and a writer.
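To illustrate why a content-addressing datastore makes repeated backups cheap, here is a minimal in-memory sketch. It is not how CCouch works internally: SHA-256 with a made-up "sha256:" key prefix stands in for whatever hash and naming the real store uses, and ContentAddressedStore is a hypothetical class.

```typescript
import { createHash } from "node:crypto";

// Toy content-addressed store: the key is derived entirely from the content.
class ContentAddressedStore {
  private blobs = new Map<string, Uint8Array>();

  // Store data and return its key. Storing the same bytes again is a no-op.
  put(data: Uint8Array): string {
    const key = "sha256:" + createHash("sha256").update(data).digest("hex");
    if (!this.blobs.has(key)) {
      this.blobs.set(key, data.slice()); // keep a private copy
    }
    return key;
  }

  get(key: string): Uint8Array | undefined {
    return this.blobs.get(key);
  }

  get blobCount(): number {
    return this.blobs.size;
  }
}

const store = new ContentAddressedStore();
const photo = new TextEncoder().encode("pretend this is a photo");
const keyFromPhone = store.put(photo);
const keyFromLaptop = store.put(photo); // same bytes, backed up a second time
// keyFromPhone === keyFromLaptop, and store.blobCount is still 1
```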
Procedure:
- Give the original media/partition a name.
- e.g. WSITEM-3334.5 is a partition on WSITEM-3334, an SSD I bought in 2024.
- Set the partition label to that name, or some shorthand, and possibly add a .partition-info.txt or something in the root to help identify it or provide other information 10-years-from-now you might want to know.
- Generate a manifest of the files using ccouch3 walk-fs.
- I like to store these on the partition as /manifests/<name>-<subdir>-<date>.manifest, e.g. WSITEM-202003.1-DCIM-20260207T1336.manifest.
- Feel free to only include certain files/directories, but it's also fine to go ahead and include everything, including earlier manifests.
- Store the manifests in CCouch.
- I use the file-manifests sector for this.
- Reference manifests in a central location.
- I have a FileManifests Git repo, but these files can be so large that they cause problems for Git. It might be better to store them in CCouch and just reference them by hash from a central Git repo. That is, I don't yet have a specific way to do this.
- Back up individual files to CCouch.
- Don't worry about cleaning up the directory structure, first
- If it's too much trouble, don't even worry about storing the directory structure; you've already recorded exactly what was on this partition in the manifest.
- todo: improve CCouch tooling to be able to, like, weak-ignore some files; reference them but don't copy.
- Write collections of files to M-disc or other ROM.
- If not too inconvenient, include the manifest (under /manifests/), the CCouch tree or directory URI (in /.ccouch-uris or /.commit-uris, respectively), any CCouch directory data (under /.ccouch/data/).
- Feel free to include other manifests and other data in those directories, too. An updated manifest for this partition should include all of it.
- Store the manifest for your M-discs in your central manifest repository, also.
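The manifest naming convention from the procedure above (/manifests/<name>-<subdir>-<date>.manifest) can be sketched as a small helper. This assumes the date part is a minute-resolution timestamp in UTC; the example name in the procedure doesn't say which time zone is meant, so treat that as a guess. manifestPath is a hypothetical name.

```typescript
// Zero-pad a number to the given width.
function pad(n: number, width = 2): string {
  return String(n).padStart(width, "0");
}

// Build a manifest path like /manifests/<name>-<subdir>-<date>.manifest,
// with the date formatted as YYYYMMDDTHHMM (UTC is an assumption).
function manifestPath(partitionName: string, subdir: string, when: Date): string {
  const stamp =
    pad(when.getUTCFullYear(), 4) +
    pad(when.getUTCMonth() + 1) +
    pad(when.getUTCDate()) +
    "T" +
    pad(when.getUTCHours()) +
    pad(when.getUTCMinutes());
  return `/manifests/${partitionName}-${subdir}-${stamp}.manifest`;
}

const examplePath = manifestPath(
  "WSITEM-202003.1",
  "DCIM",
  new Date(Date.UTC(2026, 1, 7, 13, 36)), // months are zero-based: 1 = February
);
// examplePath === "/manifests/WSITEM-202003.1-DCIM-20260207T1336.manifest"
```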
Then, later on, you can:
- Write a tool to index all the manifests and CCouch directory data
and help you find out what files are available where
- As of 2026-02-10, I have been meaning to build this tool for a few years, but have not yet done so. Presumably it will tie into TOGETL somehow.
- In lieu of a nice tool, grep through manifests.
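A very rough sketch of what the core of such an indexing tool might look like. The manifest layout here is an assumption made purely for illustration: each data line is taken to be a path and a content URN separated by a tab, with lines starting with # skipped. The real manifest columns may well differ, and indexManifests and FileLocation are hypothetical names.

```typescript
// Where one copy of a file was seen.
interface FileLocation {
  manifest: string;
  path: string;
}

// Index manifests (name -> text) by content URN, so that for any file you can
// list every manifest and path where it appears.
function indexManifests(manifests: Map<string, string>): Map<string, FileLocation[]> {
  const index = new Map<string, FileLocation[]>();
  manifests.forEach((text, manifestName) => {
    for (const line of text.split(/\r?\n/)) {
      if (line === "" || line.startsWith("#")) continue; // blank or comment/directive
      const [path, urn] = line.split("\t");
      if (path === undefined || urn === undefined) continue; // not a two-column line
      const locations = index.get(urn) ?? [];
      locations.push({ manifest: manifestName, path });
      index.set(urn, locations);
    }
  });
  return index;
}

// Made-up manifests and URNs, just to show the shape of the result.
const exampleIndex = indexManifests(
  new Map([
    ["disk-a.manifest", "# example manifest\nDCIM/001.jpg\turn:example:aaa\nnotes.txt\turn:example:bbb"],
    ["disc-1.manifest", "photos/001.jpg\turn:example:aaa"],
  ]),
);
const copiesOfPhoto = exampleIndex.get("urn:example:aaa") ?? [];
// copiesOfPhoto lists the file's location in both manifests
```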
TODO: how to back up Git repos or similar structures that do their own content addressing/deduplication?
- project (task) :: a set of tasks to be completed
- projbase, project (codebase) :: a versioned set of files, usually stored in one or more Git repositories
- TEF :: TOGoS's Entry Format, a relatively human and machine-readable/writable format for storing arbitrary information in a loosely-structured way
- TSV :: Tab-separated values; I have my own dialect that I follow, which is itself a line-oriented 'hash-format'
- namespaces.tef, which should probably be moved into this repository.
- TOGVM
- RDF types and predicates for describing functional programs
TODO: Consolidate all of them!
- M3U Extensions
- TOGoS Binary Blocks and TOGoS Text Blocks
- TSVFileManifestV1
- one #format('hash-format') format
I try to stick to a relatively consistent convention
for text-based formats, which is that # can be used to indicate
line comments or half-out-of-band data. If followed by whitespace,
it's a comment. If followed by a word, it may have special meaning.
#! is also treated as a comment to allow for shebang lines.
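The # convention above can be sketched as a line classifier. Two cases are not covered by the description and are guesses here: a bare # with nothing after it is treated as a comment, and # followed by anything other than whitespace, ! or a word character is reported as "unspecified". classifyLine and LineKind are hypothetical names.

```typescript
type LineKind = "comment" | "directive" | "data" | "unspecified";

// Classify one line of a text-based format that follows the '#' convention.
function classifyLine(line: string): LineKind {
  if (!line.startsWith("#")) return "data";
  if (line.length === 1) return "comment";       // bare "#": assumed to be a comment
  if (line.startsWith("#!")) return "comment";   // shebang lines
  if (/^#\s/.test(line)) return "comment";       // "#" followed by whitespace
  if (/^#\w/.test(line)) return "directive";     // "#" followed by a word
  return "unspecified";                          // e.g. "#-": not covered by the convention
}

const kinds = [
  "#!/usr/bin/env some-interpreter",
  "# just a comment",
  "#format something",
  "plain data line",
].map(classifyLine);
// kinds is ["comment", "comment", "directive", "data"]
```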
i.e. lexical <-> value encodings
- https://www.nuke24.net/docs/2023/SubjectDatatype.html
- TS34Encoded Datatype
- Sort of a flexible meta-datatype that can be used to indicate a series of encodings
- active:
- For representing function applications as URIs
- urn:bitprint:
- x-git-object:
- For referencing blobs, directories, and commits by their Git hash (which is based on, but isn't exactly, SHA-1)
- x-rdf-subject:
- Prefix to indicate the concept described by the document identified by the rest of the URI.
- urn:oid:
- Identify things by OID
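As a concrete illustration of the "based on, but isn't exactly, SHA-1" note for x-git-object: Git hashes a blob as SHA-1 over a short header ("blob", the content length, and a NUL byte) followed by the content, rather than over the content alone. The sketch below computes that. The URI shape (the hex hash directly after the scheme prefix) is an assumption, and gitBlobHash and gitObjectUri are hypothetical helper names.

```typescript
import { createHash } from "node:crypto";

// Compute the Git object ID of a blob: SHA-1 of "blob <length>\0" + content.
function gitBlobHash(content: Uint8Array): string {
  const header = new TextEncoder().encode(`blob ${content.length}\u0000`);
  return createHash("sha1").update(header).update(content).digest("hex");
}

// Assumed URI shape: the hex hash directly after the scheme prefix.
function gitObjectUri(hash: string): string {
  return "x-git-object:" + hash;
}

const emptyBlobUri = gitObjectUri(gitBlobHash(new Uint8Array(0)));
// emptyBlobUri === "x-git-object:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```

The plain SHA-1 of empty input is da39a3ee..., which is why the same file gets a different identifier in Git than in a store that hashes raw content.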