Re: Re: From TODO, XML? - Mailing list pgsql-hackers
From | Gavin Sherry |
---|---|
Subject | Re: Re: From TODO, XML? |
Date | |
Msg-id | Pine.LNX.4.21.0107301513110.27111-100000@linuxworld.com.au |
In response to | Re: Re: From TODO, XML? (mlw <markw@mohawksoft.com>) |
Responses | Re: Re: From TODO, XML? |
List | pgsql-hackers |
On Mon, 30 Jul 2001, mlw wrote:

> Bruce Momjian wrote:
> >
> > > > I would find it very helpful to see a table of what sorts of XML
> > > > functionality each major vendor supports.
> > >
> > > Actually I was thinking of databases of data, not database systems.
> >
> > I think we can go two ways. Allow COPY/pg_dump to read/write XML, or
> > write some perl scripts to convert XML to/from our pg_dump format. The
> > latter seems quite easy and fast.
>
> I have managed to get several XML files into PostgreSQL by writing a parser,
> and it is a huge hassle, the public parsers are too picky. I am thinking that a
> fuzzy parser, combined with some intelligence and an XML DTD reader, could make
> a very cool utility, one which I have not been able to find.

I have had the same problem. The best XML parser I could find was the gnome-xml library at xmlsoft.org (libxml). I am currently using this in C to replicate a client's legacy Notes system on to Postgres. In this case I was lucky inasmuch as I had some input on the XML namespace etc. XML was used because they had already designed an XML-based dump utility.

However, the way XML is being used is very basic. Only creation of tables, insert and delete are handled. Libxml works fine with this, however, handling DTD/XML parsing, UTF-8, UTF-16 and iso-8859-1, validation etc.

The main problem then is that every vendor has a different XML namespace. If people really want to pursue this, the best thing to do would be to try to work with other open source database developers and design a suitable XML namespace for open source databases. Naturally, there will be much contention here about the most suitable this and that. It will be difficult to get a real spec going and will probably be much more trouble than it is worth. As such, if this fails, then we cannot expect Oracle, IBM, Sybase, MS and the rest to ever do it.
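To make the basic usage described above concrete (table creation, insert and delete only), here is a minimal sketch of turning such a dump into SQL. It is not the C/libxml code mentioned in the message: it uses Python's standard `xml.etree.ElementTree` instead, and the element and attribute names (`dump`, `table`, `column`, `insert`, `value`, `delete`, `where`) are hypothetical, invented purely for illustration.

```python
import xml.etree.ElementTree as ET


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(text):
    """Quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def dump_to_sql(xml_text):
    """Translate a (hypothetical) XML dump into a list of SQL statements."""
    statements = []
    for node in ET.fromstring(xml_text):
        if node.tag == "table":
            # Column types are copied through as-is; a real tool would
            # validate or map them rather than trust the dump.
            cols = ", ".join(
                f"{quote_ident(c.get('name'))} {c.get('type')}"
                for c in node.findall("column")
            )
            statements.append(
                f"CREATE TABLE {quote_ident(node.get('name'))} ({cols});"
            )
        elif node.tag == "insert":
            values = node.findall("value")
            names = ", ".join(quote_ident(v.get("column")) for v in values)
            data = ", ".join(quote_literal(v.text or "") for v in values)
            statements.append(
                f"INSERT INTO {quote_ident(node.get('table'))} "
                f"({names}) VALUES ({data});"
            )
        elif node.tag == "delete":
            conds = " AND ".join(
                f"{quote_ident(w.get('column'))} = {quote_literal(w.text or '')}"
                for w in node.findall("where")
            )
            if not conds:
                # Refuse to emit an unconditional DELETE by accident.
                raise ValueError("delete without a where condition")
            statements.append(
                f"DELETE FROM {quote_ident(node.get('table'))} WHERE {conds};"
            )
        else:
            raise ValueError(f"unsupported instruction: {node.tag}")
    return statements


# Hypothetical sample dump, made up for this sketch.
SAMPLE_DUMP = """<dump>
  <table name="notes">
    <column name="id" type="integer"/>
    <column name="body" type="text"/>
  </table>
  <insert table="notes">
    <value column="id">1</value>
    <value column="body">it's here</value>
  </insert>
  <delete table="notes">
    <where column="id">1</where>
  </delete>
</dump>"""

if __name__ == "__main__":
    for statement in dump_to_sql(SAMPLE_DUMP):
        print(statement)
```

All values are emitted as quoted literals and left for the server to coerce to the column type, which keeps the sketch short at the cost of ignoring NULLs and binary data.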
Perhaps then it would be sufficient for pg_dump/restore to identify the namespace of a given database dump and parse it according to that namespace. Based on command-line arguments, pg_restore/dump could either die/ignore/transmogrify instructions in the XML which PG does not support or recognise. It would also be useful if pg_dump could dump data from postgres in the supported XML namespaces.

So it essentially comes down to how useful it will be and who has time to code it up =) (as always).

**Creative Solution**

For those who have too much time on their hands and have managed to untangle some of the syntax in the W3C XSLT 1.0 specification, how about an XSL stylesheet to transform an XML-based database dump from some third party into (postgres) SQL. Erk! There would have to be an award for such a thing ;-).

Gavin
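The namespace-driven idea above (identify the dump's namespace, then die, ignore or transmogrify unsupported instructions) could look roughly like the following sketch. It is only an illustration under stated assumptions: the namespace URIs, the instruction names, the rewrite table and the `on_unsupported` parameter are all hypothetical and do not correspond to any existing pg_dump or pg_restore option.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces mapped to the instructions we claim to understand.
SUPPORTED = {
    "urn:example:pg-dump": {"table", "insert", "delete"},
    "urn:example:other-vendor-dump": {"table", "insert"},
}

# Hypothetical rewrites used by the "transmogrify" policy:
# foreign instruction name -> equivalent supported instruction.
REWRITES = {
    "urn:example:other-vendor-dump": {"row": "insert"},
}


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag


def filter_dump(xml_text, on_unsupported="die"):
    """Return the instruction names accepted under the chosen policy.

    on_unsupported is one of "die", "ignore" or "transmogrify".
    """
    root = ET.fromstring(xml_text)
    namespace, _ = split_tag(root.tag)
    if namespace not in SUPPORTED:
        raise ValueError(f"unknown dump namespace: {namespace!r}")

    accepted = []
    for node in root:
        _, name = split_tag(node.tag)
        if name in SUPPORTED[namespace]:
            accepted.append(name)
        elif on_unsupported == "ignore":
            continue  # silently skip what we do not recognise
        elif (on_unsupported == "transmogrify"
              and name in REWRITES.get(namespace, {})):
            accepted.append(REWRITES[namespace][name])
        else:
            # "die", or "transmogrify" with no known rewrite
            raise ValueError(f"unsupported instruction: {name}")
    return accepted


# Hypothetical third-party dump, made up for this sketch.
VENDOR_DUMP = (
    '<dump xmlns="urn:example:other-vendor-dump">'
    "<table/><row/>"
    "</dump>"
)

if __name__ == "__main__":
    print(filter_dump(VENDOR_DUMP, on_unsupported="ignore"))
    print(filter_dump(VENDOR_DUMP, on_unsupported="transmogrify"))
```

Keeping the per-namespace tables as plain data means supporting another vendor's dump would be a matter of adding entries rather than new parsing code, which is one possible reading of "parse it according to that namespace".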