distibuted transactions, SQL+XPath+XTree - Mailing list pgsql-hackers
From | Тюрин Дмитрий |
---|---|
Subject | distibuted transactions, SQL+XPath+XTree |
Date | |
Msg-id | 1582589687.20080220082204@narod.ru |
Responses | Re: distibuted transactions, SQL+XPath+XTree |
List | pgsql-hackers |
Hi list,

I see the following business opportunities for Postgres:

I) An ordinary user cannot program the middleware needed to connect an XML client to Postgres.
II) A single request cannot address several databases.
III) Notebooks may need to be switched off and on several times during one transaction.
IV) Distance between strings is not supported, which makes approximate searching impossible.
V) There is no way to hide some (not all) records of a table, granted to other users, from those users.

Proposed solutions:

I) The DBMS inserts into tables and selects from tables along the Primary Key - Foreign Key relationships.
II) Databases get nicknames, and groups of databases get a group name. These names are used in requests.
III) An operator to freeze a transaction.
IV) An operator to order records by the distance between strings.
V) Subdivide the records of all tables into classes, and specify the class number in each record.

I ask you to implement these solutions, so that Postgres gains an advantage over other DBMSes. I have prepared several drawings, http://sql50.euro.ru/sql5.11.3.ppt, to explain the ideas. More details are described below.

===

1) [slides 2-12]

Problem: The browser is a very widespread client in the internet era. Non-programmers can master 'insert', 'select', 'update' and 'delete', but cannot use the sophisticated syntax of a proprietary web server for XML input. This syntax must be eliminated, so that a user can install the DBMS and use it immediately, just as users install and use Teleport, FlashGet, a browser and so on.

Solution: The DBMS itself must communicate via HTTP, accept XML, and place the data from it into tables according to some convention. My proposal for the convention:
*) an XML element is written into the table with the identical name (i.e. tag name coincides with table name), and an XML attribute is written into the field with the identical name (i.e. attribute name coincides with field name);
*) tables are bound into a tree by the values of Primary Keys and Foreign Keys;
*) the primary key value of a new record is assigned by a trigger.

If the user has a simple schema, then this is enough! A complex schema can be ambiguous because of several referring fields; in that case the user must append the symbol '#' and the name of the required referring field to the name of the XML tag being sent (it looks like a new tag name with '#' inside it). Let's call this 'determination' [the symbol '$' is used for a list, so that an ambiguity for the list can be resolved together with an ambiguity for the enclosed XML element, i.e. so that two referring fields can be appended to the name of the XML tag being sent].

P.S. [slides 13-21] Of course, we extend this solution to manual 'insert'.

2) [slides 22-31]

Problem: Using both SQL/XML functions and the syntax of a proprietary web server gives very bulky code for extracting a tree as XML. This makes it harder for the DBMS to work with CML, GML, HumanML, OPML, RCML, SBML, ebXML, MDDL, RIXML, XBRL, xCBL and others (turning all relational fields into XML elements is suitable for a browser, but not for other cases).

Solution: To avoid sophisticated programming, 'select' itself must return data to the client (unless the 'select' is used inside 'insert ... select ...'). I propose the laconic 'select a.b.c' to select data from tables 'a', 'b', 'c'. Let's call this 'XTree', by analogy with 'XPath'.

If the user has a simple schema, then this is enough! A complex schema can be ambiguous because of several referring fields; in that case the user must append the symbol '#' and the name of the required referring field to the table name (it looks like a new table name with '#' inside it). Let's call this 'refinement'.

P.S. [slides 32-39] All possible combinations of determinations in an XML tree and all possible combinations of refinements in 'select' are considered, and appropriate XML and SQL syntaxes are proposed.

P.S. [slides 40-49] Examples of the use of refinement are demonstrated.

3) [slides 50-58]

Problem: Unpredictable, non-repeating input data (XML elements) is written into an XML field of a relational table. XQuery is offered for processing the data in these XML fields. But the user cannot manipulate records with SQL and XML elements with XQuery in one request (even if relational storage is abandoned in favour of an XML database, i.e. a non-relational 'engine', the nested loops of XQuery produce very bulky code in which the user loses orientation).

Solution: I propose to add XPath to SQL, so that SQL can process XML elements and attributes (i.e. to avoid XQuery). Thus SQL can process records and XML elements simultaneously.

P.S. [slides 59-72] Of course, we generalize XPath and XTree into XLang, and consider all possible use cases.

---

4) [slides 73-83-116]

Problem: SQL would be more flexible and convenient than vendor-specific programs for distributed requests (gathering data from several databases and scattering it into several databases); SQL is also more convenient than vendor-specific programs for replication. But the necessary syntax does not exist.

Solution: Each database has a nickname. Nicknames are specified in requests as a prefix before the table name. A group of databases is called a society. The name of a society can also be specified as a prefix before a table name, and stands for the nicknames of all databases in the group. Thus one SQL statement containing a society name stands for a large number of SQL statements with nicknames.

So that nicknames, several societies, or several mentions of one society do not specify the same database simultaneously, we place the symbol '%' before them (let's call such a prefix a restricted prefix). So that several mentions of one society synchronously specify the same database, we place any (identical) word and the symbol '%' before these mentions (let's call this word a marker, and this prefix a marked prefix).
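To make the society idea concrete, here is a minimal Python sketch of how one statement with a society prefix could stand for one statement per nickname. Everything in it is an assumption for illustration only: this message does not fix how a prefix is separated from the table name (the sketch assumes 'prefix.table'), and the function name, the society 'branches' and its nicknames are hypothetical. Restricted and marked prefixes ('%') are not modelled; if the society is mentioned more than once, the sketch substitutes the same nickname everywhere.

```python
import re


def expand_society(statement, society, nicknames):
    """Return one statement per nickname for a statement using a society prefix.

    Assumption: a prefix is written as 'prefix.table'. The separator is not
    specified in the proposal; the dot is chosen only for this sketch.
    """
    # Match the society name only when it is a whole word followed by a dot.
    pattern = re.compile(rf"\b{re.escape(society)}\.")
    if not pattern.search(statement):
        # No society prefix present: the statement stands for itself.
        return [statement]
    # Replace every mention of the society with the same nickname.
    return [pattern.sub(lambda _m: nick + ".", statement) for nick in nicknames]


# Hypothetical society 'branches' with two member databases.
expanded = expand_society(
    "select count(*) from branches.orders",
    "branches",
    ["moscow", "kazan"],
)
```

Here `expanded` holds two statements, one addressed to 'moscow.orders' and one to 'kazan.orders'. In the proposal this expansion would happen inside the DBMS rather than in client code.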
The 'default' database is the database in which all nicknames and societies are stored. The prefix 'all' means all databases known to the default database.

A nickname can have a numeric parameter NID (nick identifier). It cannot be changed by data-processing requests, and is designated as '%%'.

For security, distributed requests must satisfy some requirements. I propose complete mistrust between DBMSes:
*) a database does not store the login of another database (so that a break-in does not give away a foreign login);
*) a database does not edit (update, insert, delete) another database (so that temporary access to another DBMS, gained in a break-in, cannot destroy the other database);
*) a database does not get data from another database wherever this can be avoided (so that data from the other database does not become accessible in a break-in as well).

I also propose to assume a very simple client:
*) the client does not know SQL (it cannot simplify or decompose SQL).

So DBMS-1 cannot create an SQL command and enter it into DBMS-2, either directly or indirectly (by asking the client to forward the command). And the client cannot derive an SQL command from the one that was entered (all it has is the last SQL command, stored in its own stack). I propose that DBMS-1 transfers __XML__ commands to the client, which force the client to make a simple (string) transformation of the SQL command stored in the client's stack and to send the result of the transformation to DBMS-2. The transformations must be limited enough that no SQL command harmful to DBMS-2 can appear. I propose to write these XML commands as <?name?> to distinguish them from XML data (traditionally written as <name>).

5) [slide 118]

Problem: A user runs a transaction from a notebook and needs to switch the notebook off without committing or rolling back the transaction, in order to continue it at the next switch-on from the point where it was left.
Solution: I propose the command 'freeze', similar to the command 'disconnect', which saves the transaction in its current state; and the command 'unfreeze', similar to the command 'connect', which continues the frozen transaction (instead of starting a new one). 'Freeze' returns the identifier of the frozen transaction, which should be used in 'unfreeze'.

P.S. [slide 119] At present a savepoint can only be used to roll back to it. I propose the command 'commit savepoint' (commit all actions made before the savepoint), which is useful in many cases before the command 'freeze'.

6) [slide 120]

Problem: The commit of a distributed transaction (one executed in several databases) does not fail in all databases at once. It is not reasonable to roll back the transaction in the databases that remain healthy and restart it there from the very beginning; it is reasonable to wait for the failed databases to be repaired and then commit the transaction together with them. So we need to freeze the 'commit' command in the healthy databases while it is being executed (as in the case of freezing a transaction, let client messages be SQL commands and server messages be XML commands).

Solution: I propose to introduce the client message 'postpone' to freeze the commit in its second phase, and the client message 'adjourn' to freeze it in its third phase.

7) [slide 124]

Problem: In stream processing (when new records arrive quickly and in large quantity), an aggregate must be executed only over the several most recently entered records (to form a sliding window). But creating an index on the field that sequences the records, and sorting on this index in order to cut out only the needed number of records, slows down the processing of the stream.

Solution: Limit the number of records in a table, make a queue of records, and automatically delete records from the head of the queue as new records arrive; then run aggregates over all records of this specially organized table. To keep the records automatically deleted from the head of the queue, they can be copied automatically into another, ordinary table (which stores them permanently).
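The queue-table behaviour in item 7 can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions, not a description of how a DBMS would implement it: the class name `QueueTable`, its methods and the plain list used as the 'ordinary' archive table are all hypothetical.

```python
from collections import deque


class QueueTable:
    """In-memory stand-in for a table limited to the newest `capacity` records."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.rows = deque()   # the bounded queue of records
        self.archive = []     # hypothetical ordinary table keeping evicted records

    def insert(self, value):
        # When the table is full, the oldest record leaves the head of the
        # queue and is copied into the archive before the new one is appended.
        if len(self.rows) == self.capacity:
            self.archive.append(self.rows.popleft())
        self.rows.append(value)

    def average(self):
        # The aggregate runs over all records of the bounded table, so no
        # index or sort is needed to find the most recent ones.
        if not self.rows:
            return None
        return sum(self.rows) / len(self.rows)


# Five records arrive in a table limited to three.
table = QueueTable(capacity=3)
for reading in [1, 2, 3, 4, 5]:
    table.insert(reading)
```

After the loop the table holds only the three newest readings, the two oldest sit in `archive`, and `table.average()` is computed over the window alone.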
8) [slides 125-129]

Problem: Distance between strings is not supported, which makes searching for similar strings and ordering by degree of resemblance impossible.

Solution: A method for calculating the distance between strings, and an operator that orders records by this distance.

9) [slide 132]

Problem: There is no way to make some (personal) records inaccessible to other users.

Solution: Subdivide the records of all tables into non-overlapping classes, and specify the class number in a special field of each record.

Dmitry Turin
HTML6 (6.5.4) http://html60.euro.ru
SQL5 (5.11.3) http://sql50.euro.ru
Unicode7 (7.2.1) http://unicode70.euro.ru
Computer2 (2.0.2) http://computer20.euro.ru