Home > mailing lists

Re: behavior of PGtokenizer w/ escaped delim - Mailing list pgsql-jdbc

From	Dave Cramer
Subject	Re: behavior of PGtokenizer w/ escaped delim
Date	November 2, 2013 13:46:42
Msg-id	CADK3HHJbk01RYuaYmR39KrFeZOq4RuztDUeN-Fka_dePf7Tocw@mail.gmail.com Whole thread Raw
In response to	behavior of PGtokenizer w/ escaped delim (Ivonne Lopez <ivlo11@me.com>)
Responses	Re: behavior of PGtokenizer w/ escaped delim
List	pgsql-jdbc

Tree view

Ivonne,

Sorry for my late reply. Can you send a pull request so I can see the diff's? Does your fork pass all the rest of the tests ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On Thu, Oct 31, 2013 at 11:30 AM, Ivonne Lopez <ivlo11@me.com> wrote:

I have been working on a project that requires using array_agg of composites with left joins in a one to many relationship.

My issue comes when parsing that results back in jdbc using PGtokenizer. Initially after reading the comments in the file I still believed that it would be able to sparse my result correctly, however on further inspection the only nesting it supports is '(' ')' '[' ']' '<' and '>'. If there is a comma within the text, even though pg "escapes" it by surrounding that column in quotes, PGtokenizer doesn't honor this.

That being said, I add that support to my forked copy of pgjdbc https://github.com/ivlo11/pgjdbc/tree/nest_quotes_pgtokenizer and so far it works perfectly. I still have to commit in the junit test cases I used.

So here are my questions:

1. Am I naive to think this approach will work?
2. And if not, what are the possibilities of this getting pulled into pgjdbc's master?

If you want to reproduce this, a sample setup is below.

Thanks,
Ivonne

CREATE TABLE person
(
id serial,
name character(10),
CONSTRAINT pkey_person PRIMARY KEY (id)
);

CREATE TABLE car
(
id serial,
name character(10),
description character varying,
owner integer,
CONSTRAINT pkey_car PRIMARY KEY (id),
CONSTRAINT fkey_car FOREIGN KEY (owner) REFERENCES person (id)  ON UPDATE CASCADE ON DELETE CASCADE
);

INSERT INTO person (name) VALUES ('ivonne');
INSERT INTO car (name,description,owner) VALUES ('lexus','December 20, 2011',1),('honda','April 1, 2008',1),('mitsubishi','August 12, 1998',1);

SELECT person.name, array_agg(DISTINCT (car.id, car.name, car.description)) AS cars
FROM person
LEFT JOIN car ON (car.owner = person.id)
WHERE person.id = 1
GROUP BY person.name LIMIT 1

"ivonne ";"{"(1,\"lexus \",\"December 20, 2011\")","(2,\"honda \",\"April 1, 2008\")","(3,mitsubishi,\"August 12, 1998\")"}"

As you can see with the composite type some columns that can have quotes to deal with spaces and commas within them. Considering how escaping works in pg, I assumed that using PGTokenizer would be most efficient versus StringTokenizer.

@Test
public void testSelectPersonWithArrayCompositeLeftJoin() {
DBConnection dbconn = new DBConnection();
ResultSet rs;
try (Connection conn = dbconn.getConnection();)
{
rs = ORMPerson.selectPerson(conn, 1); // uses a prepared statement of the above select query
assertNotNull(rs);
assertTrue(rs.next());

Array cars = rs.getArray("cars");
assertNotNull(cars);

ResultSet rsCars = cars.getResultSet();
assertNotNull(rsCars);

while (rsCars.next()) {
String comp = rsCars.getString(2);
PGtokenizer token = new PGtokenizer(PGtokenizer.removePara(comp),',');
for (int i = 0; i < token.getSize(); i++) {
System.out.println(token.getToken(i));
}
System.out.println();
}
} catch (SQLException e) {
e.printStackTrace();
fail("SQLException returned, couldn't select person");
}
}

But this returns this:

1
"lexus     "
"December 20
2011"

2
"honda     "
"April 1
2008"

3
mitsubishi
"August 12
1998"

Which as you can see isn't right… the date is split into two "columns"

pgsql-jdbc by date:

From: Dave Cramer
Date: 01 November 2013, 16:51:04
Subject: Re: JDBC 9.3 released

From: Thomas Kellerer
Date: 04 November 2013, 07:18:07
Subject: Re: JDBC 9.3 released

Re: behavior of PGtokenizer w/ escaped delim - Mailing list pgsql-jdbc

Previous

Next