Aim

(download source at https://github.com/emilbarton/Unidatab)

Unidatab is an amateur’s attempt to provide a universal database mechanism. It is coded in SQL/C++ with an administrator tool and remote server and client.

Universality should be understood here as having a varying number of tables (called subdbs) and a varying number of columns in each record of each subdb. This design has a cost in terms of speed, lightness and relationality.

Unidatab is for processes and persons who want to investigate small databases without having to ask if, where and how each time. Data are entered as records, each of which may pertain to one subdatabase only. The number of columns of any record can be changed at any time, independently of whether the record belongs to a subdb. Records are identified by a unique id number but can also receive a handy mnemonic alias.
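To make the idea of records with a varying number of columns more concrete, here is a minimal sketch using Python's standard sqlite3 module. The table names (subdb, record, cell), the helper columns_of and the sample data are hypothetical and chosen for illustration only; they are not Unidatab's actual schema, and the sketch assumes that a record may also belong to no subdb at all.

```python
import sqlite3

# Hypothetical schema for illustration only; not Unidatab's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subdb  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE record (
        id       INTEGER PRIMARY KEY,          -- unique id number
        alias    TEXT UNIQUE,                  -- optional mnemonic alias
        subdb_id INTEGER REFERENCES subdb(id)  -- one subdb at most (assumed nullable)
    );
    CREATE TABLE cell (
        record_id   INTEGER NOT NULL REFERENCES record(id),
        column_name TEXT NOT NULL,
        value       TEXT
    );
""")

conn.execute("INSERT INTO subdb (id, name) VALUES (1, 'library')")
conn.execute("INSERT INTO record (id, alias, subdb_id) VALUES (1, 'moby', 1)")
conn.execute("INSERT INTO record (id, alias, subdb_id) VALUES (2, NULL, NULL)")
conn.executemany(
    "INSERT INTO cell (record_id, column_name, value) VALUES (?, ?, ?)",
    [(1, "title", "Moby-Dick"), (1, "author", "Herman Melville"),
     (2, "note", "loose record")],
)

# A column can be added to one record later without touching any other record.
conn.execute("INSERT INTO cell VALUES (1, 'year', '1851')")

def columns_of(key):
    """Return {column_name: value} for a record found by id (int) or alias (str)."""
    lookup = "id" if isinstance(key, int) else "alias"
    rows = conn.execute(
        f"""SELECT c.column_name, c.value FROM cell c
            JOIN record r ON r.id = c.record_id
            WHERE r.{lookup} = ?""",
        (key,),
    )
    return dict(rows.fetchall())

print(columns_of("moby"))  # record 1 now has three columns
print(columns_of(2))       # record 2 still has one
```

Because every column is a row of its own, two records of the same subdb never have to agree on how many columns they carry.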

Columns are realized as type-property pairs, also known as symbols. Each type represents a column name, and the corresponding property its value. Records that respond to a specific content request can receive a definable mark. The records showing this special symbol are then available for bulk actions. A record can receive as many symbols and marks as desired. Recursion is a side effect of this design: as records only contain node pointers, two records can have each other as property and a single record can be a property of itself. Any nominalist use of data structures should be stimulated by this feature, which makes it possible to handle XML-like data trees in an SQL context. Possible loops are avoided at print time by ensuring that the content of a record is only output once, additional mentions being printed as pointers, not containers.
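The loop-avoiding print rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record class, the render function and the convention that an integer property is a pointer to another record are hypothetical, not taken from Unidatab's C++ source.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    rec_id: int
    # Each symbol is a (type, property) pair; a property is either literal
    # text or the id of another record (a pointer, not a nested copy).
    symbols: list = field(default_factory=list)

def render(db, rec_id, seen=None, depth=0):
    """Return output lines for a record, expanding each record only once."""
    seen = set() if seen is None else seen
    pad = "  " * depth
    if rec_id in seen:
        return [f"{pad}-> record #{rec_id}"]   # repeated mention: pointer only
    seen.add(rec_id)
    lines = [f"{pad}record #{rec_id}"]
    for type_name, prop in db[rec_id].symbols:
        if isinstance(prop, int):              # property points to a record
            lines.append(f"{pad}  {type_name}:")
            lines.extend(render(db, prop, seen, depth + 2))
        else:                                  # property is literal text
            lines.append(f"{pad}  {type_name}: {prop}")
    return lines

# Two records that have each other as property, one of which also contains itself.
db = {
    1: Record(1, [("name", "Alice"), ("friend", 2), ("self", 1)]),
    2: Record(2, [("name", "Bob"), ("friend", 1)]),
}
print("\n".join(render(db, 1)))
```

Record 1 is expanded once at the top; its later mentions inside record 2 and inside itself come out as "-> record #1" lines, so the output stays finite even though the data contain cycles.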

Shortcomings and Tests:

In spite of its slow and *not* constant-time file insertion tool, Unidatab’s best use is in one-by-one record addition; it has been tested as a valuable db for personal administration and installed as a handy replacement for a classical database application in a small association’s library (6’000 records with 43 fields, or “columns”, which represent a 22MB database).

Field content is not repeated in Unidatab so the size cost of horizontal elasticity is regained as the database grows.
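One way to picture content that is not repeated is a table in which each distinct string is stored once and referenced by id. The sketch below assumes such a layout; the tables content and symbol and the helpers intern_text and add_symbol are hypothetical names, and Unidatab's real tables may well differ.

```python
import sqlite3

# Hypothetical layout, assumed for illustration: literal text lives once in
# `content`; symbols only store the ids of their type and of their property.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (id INTEGER PRIMARY KEY, text TEXT UNIQUE NOT NULL);
    CREATE TABLE symbol  (record_id INTEGER, type_id INTEGER, property_id INTEGER);
""")

def intern_text(text):
    """Return the id of `text`, inserting it only if it is not stored yet."""
    conn.execute("INSERT OR IGNORE INTO content (text) VALUES (?)", (text,))
    row = conn.execute("SELECT id FROM content WHERE text = ?", (text,)).fetchone()
    return row[0]

def add_symbol(record_id, type_name, prop):
    """Attach a type-property pair to a record, reusing stored strings."""
    conn.execute(
        "INSERT INTO symbol VALUES (?, ?, ?)",
        (record_id, intern_text(type_name), intern_text(prop)),
    )

for rec_id, city in enumerate(["Geneva", "Geneva", "Lausanne", "Geneva"], start=1):
    add_symbol(rec_id, "city", city)

symbols = conn.execute("SELECT COUNT(*) FROM symbol").fetchone()[0]
strings = conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]
print(symbols, "symbols share", strings, "stored strings")
# prints: 4 symbols share 3 stored strings
```

The more often values repeat, the more the shared strings pay back the per-symbol overhead. In a layout like this one, every inserted value also needs a lookup first, so insertion does more work than a plain append.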

Unidatab is clearly not good at what classic and big databases are made for; it is appropriate for more stable and complex objects in fewer occurrences. Every time a fixed number of compartments suffices to distinguish things, there’s no need for Unidatab.

The underlying Sqlite3 (one-file) structure makes each Unidatab db easily movable and sharable.

Insertion is by far the most expensive operation in Unidatab. Tests previously published on this page were optimistic because they related to an earlier version in which field content was repeated for quicker insertion. That hybrid “fatn’fast” version naturally led to various kinds of issues when trying to renormalize the database afterwards, and it has been abandoned.

At present (end 2017) the only test is the humble 5’986-entry library mentioned above. The insertion time was quite long: 45 minutes to inject the TSV (tab-separated values) file into a database loaded in memory at a processor speed of 2.4GHz, but only 7 minutes under the optimized version of Unidatab. Once the db is created, however, its search functions don’t seem to take too long for such a small amount of data, and they show some interesting specificities.

I assume that Unidatab would still work correctly (i.e. not too slowly) on bigger files, probably up to something like 20’000 entries, but new tests have to be done in order to verify that point, and the results would be sensitive to parameters like the data/node ratio, which changes according to the frequency of repetitions. Recursion, which is one of the main assets of Unidatab (the ability to have another record as content of a field), is not possible when inserting data from a TSV file and can only be done manually at present.
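For a rough sense of scale, the insertion timings quoted above can be turned into throughput figures. The short Python calculation below uses only the numbers given in the text, plus one assumption: that every record fills all 43 fields, which makes the field-value rate an upper bound.

```python
ENTRIES = 5986          # library records reported in the text
FIELDS_PER_ENTRY = 43   # assumption: every record fills all 43 fields (upper bound)

def throughput(minutes):
    """Return (records per second, field values per second) for a run time."""
    seconds = minutes * 60
    return ENTRIES / seconds, ENTRIES * FIELDS_PER_ENTRY / seconds

for label, minutes in (("45-minute run", 45), ("7-minute optimized run", 7)):
    records_s, values_s = throughput(minutes)
    print(f"{label}: {records_s:.1f} records/s, at most {values_s:.0f} field values/s")

print(f"speed-up: {45 / 7:.1f}x")
```

This gives about 2.2 records per second for the 45-minute run and about 14.3 for the optimized one, a speed-up of roughly 6.4 times.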

Unidatab is the first stage of a concept to develop, improve and share. It would require some expert advice and peer reviews too. Critics and contributors are welcome.

External link: Unidatab GitHub repository (https://github.com/emilbarton/Unidatab)