Re: bytea memory improvement - Mailing list pgsql-jdbc
From | Luis Vilar Flores |
---|---|
Subject | Re: bytea memory improvement |
Date | |
Msg-id | 44EB8C82.1000203@evolute.pt Whole thread Raw |
In response to | bytea memory improvement (Luis Vilar Flores <lflores@evolute.pt>) |
Responses |
Re: bytea memory improvement
Re: bytea memory improvement |
List | pgsql-jdbc |
Hello, This time I believe to have all tests and source needed to have the patch accepted. To all that already forgot the first emails, I developed an modified version of the method toBytes from the org.postgresql.util.PGbytea class. The old method uses 3 buffers to translate the data from the nework to the client, this uses too much memory. My method only uses 2 buffers, but does one more pass through the original buffer (to calculate it's final size). Bellow is a table with times and memory usage of the 2 methods, using the supplied ByteaTest class: OLD method: size: 0.5MB execute+next: 49ms getBytes: 18ms used mem: 74505KB size: 1.5MB execute+next: 94ms getBytes: 53ms used mem: 48004KB size: 2.5MB execute+next: 147ms getBytes: 110ms used mem: 23537KB size: 3.5MB execute+next: 244ms getBytes: 190ms used mem: 24504KB size: 4.5MB execute+next: 306ms getBytes: 224ms used mem: 31448KB size: 5.5MB execute+next: 364ms getBytes: 267ms used mem: 38392KB size: 6.5MB execute+next: 413ms getBytes: 308ms used mem: 45336KB size: 7.5MB execute+next: 464ms getBytes: 306ms used mem: 52281KB size: 8.5MB execute+next: 511ms getBytes: 349ms used mem: 59225KB size: 9.5MB execute+next: 804ms getBytes: 377ms used mem: 66169KB size: 10.5MB execute+next: 634ms getBytes: 546ms used mem: 73112KB size: 11.5MB execute+next: 689ms getBytes: 450ms used mem: 80057KB size: 12.5MB execute+next: 748ms getBytes: 482ms used mem: 87001KB size: 13.5MB execute+next: 820ms getBytes: 514ms used mem: 93945KB size: 14.5MB execute+next: 865ms getBytes: 734ms used mem: 100888KB size: 15.5MB execute+next: 921ms getBytes: 586ms used mem: 107833KB size: 16.5MB execute+next: 1003ms getBytes: 619ms used mem: 114777KB size: 17.5MB execute+next: 1030ms getBytes: 652ms used mem: 121721KB size: 18.5MB execute+next: 1102ms getBytes: 927ms used mem: 128664KB size: 19.5MB execute+next: 1166ms getBytes: 723ms used mem: 135609KB size: 20.5MB execute+next: 1217ms getBytes: 735ms used mem: 142583KB size: 21.5MB execute+next: 1284ms getBytes: 766ms used mem: 149527KB size: 22.5MB execute+next: 1437ms getBytes: 801ms used mem: 156471KB size: 23.5MB execute+next: 1425ms getBytes: 833ms used mem: 163415KB size: 24.5MB execute+next: 1453ms getBytes: 866ms used mem: 170359KB size: 25.5MB execute+next: 1766ms getBytes: 902ms used mem: 177303KB size: 26.5MB execute+next: 2004ms getBytes: 939ms used mem: 184247KB size: 27.5MB execute+next: 1650ms getBytes: 968ms used mem: 191191KB size: 28.5MB execute+next: 1757ms getBytes: 796ms used mem: 198105KB size: 29.5MB execute+next: 1770ms getBytes: 1040ms used mem: 205086KB size: 30.5MB execute+next: 1820ms getBytes: 1074ms used mem: 212030KB size: 31.5MB execute+next: 1869ms getBytes: 1109ms used mem: 218974KB size: 32.5MB execute+next: 1930ms getBytes: 1146ms used mem: 225918KB size: 33.5MB execute+next: 2183ms getBytes: 1177ms used mem: 232862KB size: 34.5MB execute+next: 2241ms getBytes: 1221ms used mem: 239806KB NEW method: size: 0.5MB execute+next: 50ms getBytes: 19ms used mem: 73137KB size: 1.5MB execute+next: 90ms getBytes: 50ms used mem: 43760KB size: 2.5MB execute+next: 149ms getBytes: 97ms used mem: 16136KB size: 3.5MB execute+next: 237ms getBytes: 113ms used mem: 14170KB size: 4.5MB execute+next: 302ms getBytes: 174ms used mem: 18127KB size: 5.5MB execute+next: 357ms getBytes: 234ms used mem: 22110KB size: 6.5MB execute+next: 602ms getBytes: 232ms used mem: 26095KB size: 7.5MB execute+next: 477ms getBytes: 265ms used mem: 30079KB size: 8.5MB execute+next: 532ms getBytes: 296ms used mem: 34063KB size: 9.5MB execute+next: 590ms getBytes: 385ms used mem: 38046KB size: 10.5MB execute+next: 648ms getBytes: 357ms used mem: 42031KB size: 11.5MB execute+next: 695ms getBytes: 391ms used mem: 46015KB size: 12.5MB execute+next: 765ms getBytes: 423ms used mem: 49999KB size: 13.5MB execute+next: 825ms getBytes: 542ms used mem: 53982KB size: 14.5MB execute+next: 874ms getBytes: 491ms used mem: 57967KB size: 15.5MB execute+next: 931ms getBytes: 521ms used mem: 61951KB size: 16.5MB execute+next: 992ms getBytes: 551ms used mem: 65935KB size: 17.5MB execute+next: 1063ms getBytes: 694ms used mem: 69918KB size: 18.5MB execute+next: 1111ms getBytes: 618ms used mem: 73903KB size: 19.5MB execute+next: 1168ms getBytes: 649ms used mem: 77887KB size: 20.5MB execute+next: 1230ms getBytes: 654ms used mem: 81903KB size: 21.5MB execute+next: 1289ms getBytes: 687ms used mem: 85890KB size: 22.5MB execute+next: 1345ms getBytes: 737ms used mem: 89875KB size: 23.5MB execute+next: 1415ms getBytes: 751ms used mem: 93861KB size: 24.5MB execute+next: 1461ms getBytes: 782ms used mem: 97846KB size: 25.5MB execute+next: 1521ms getBytes: 817ms used mem: 101833KB size: 26.5MB execute+next: 1587ms getBytes: 848ms used mem: 105817KB size: 27.5MB execute+next: 1634ms getBytes: 877ms used mem: 109804KB size: 28.5MB execute+next: 1692ms getBytes: 931ms used mem: 113789KB size: 29.5MB execute+next: 1748ms getBytes: 944ms used mem: 117775KB size: 30.5MB execute+next: 1820ms getBytes: 972ms used mem: 121760KB size: 31.5MB execute+next: 1869ms getBytes: 1005ms used mem: 125747KB size: 32.5MB execute+next: 1915ms getBytes: 1038ms used mem: 129731KB size: 33.5MB execute+next: 1983ms getBytes: 1088ms used mem: 133718KB size: 34.5MB execute+next: 2055ms getBytes: 1103ms used mem: 137703KB As you can see the execution time remained almost the same (small gain on the new version), but memory usage is drastically improved. These times were obtained in a Celeron M 1.6GHz laptop with 1GB RAM, running Fedora Core 5, Java 1.5.0_08 and Postgresql 8.1.4. In attach I supply the modified PGbytea.java, the patch versus 8.1-407 source (the 8.2dev-503 is the same), and the test program ByteaTest.java. The test program also validates the correctness of the result through CRC32. Hope to hear some feedback soon, hope I didn't forget anything ... Luis Flores /* * ByteaTest.java * * Created on August 22, 2006, 8:09 PM * * To change this template, choose Tools | Template Manager * and open the template in the editor. */ package com.evolute.postgresql.test; import java.sql.*; import java.util.zip.*; /** * * @author lvflores */ public class ByteaTest { private static String url = "jdbc:postgresql://localhost/test_bytea"; private static String user = "postgres"; private static String password = ""; private static final int RUNS = 35; private final CRC32 crcCalc = new CRC32(); private Connection connection = null; private long crc32[] = new long[ RUNS ]; /** Creates a new instance of ByteaTest */ public ByteaTest( String pgUrl, String pgUser, String pgPass ) throws Exception { Class.forName( "org.postgresql.Driver" ); connection = DriverManager.getConnection( pgUrl, pgUser, pgPass ); createTable(); test(); deleteTable(); connection.close(); } private void createTable() throws Exception { Statement stm = connection.createStatement(); try { stm.executeUpdate( "DROP TABLE test_bytea" ); } catch( Exception ex ) { } stm.executeUpdate( "CREATE TABLE test_bytea (" + "id INT NOT NULL, " + "data BYTEA )" ); stm.close(); } private void test() throws Exception { // insert for( int i = 0; i < RUNS; ++i ) { PreparedStatement pstm = connection.prepareStatement( "INSERT INTO test_bytea ( id, data ) VALUES ( ?, ? )" ); pstm.setInt( 1, i ); byte data[] = generateBytes( i ); // long l = System.currentTimeMillis(); pstm.setBytes( 2, data ); // long l1 = System.currentTimeMillis(); pstm.execute(); // long l2 = System.currentTimeMillis(); long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory(); // System.out.println( "size: " + i + ".5MB setBytes: " + ( l1 -l ) // + "ms execute: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" ); pstm.close(); System.gc(); } // select and check for( int i = 0; i < RUNS; ++i ) { PreparedStatement pstm = connection.prepareStatement( "SELECT data FROM test_bytea WHERE ID = ?" ); pstm.setInt( 1, i ); long l = System.currentTimeMillis(); ResultSet rs = pstm.executeQuery(); rs.next(); long l1 = System.currentTimeMillis(); byte data[] = rs.getBytes( 1 ); long l2 = System.currentTimeMillis(); crcCalc.reset(); crcCalc.update( data ); long crc = crcCalc.getValue(); if( crc != crc32[ i ] ) { System.out.println( "WRONG DATA on idx " + i ); } long usedMem = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory(); System.out.println( "size: " + i + ".5MB execute+next: " + ( l1 -l ) + "ms getBytes: " + ( l2 -l1 ) + "ms used mem: " + ( usedMem / 1024 ) + "KB" ); pstm.close(); System.gc(); } } private byte[] generateBytes( int idx ) { byte data[] = new byte[ 1024 * 1024 * idx + 512 * 1024 ]; for( int i = 0; i < data.length; ++i ) { data[ i ] = ( byte )i; } crcCalc.reset(); crcCalc.update( data ); crc32[ idx ] = crcCalc.getValue(); return data; } private void deleteTable() throws Exception { Statement stm = connection.createStatement(); stm.executeUpdate( "DROP TABLE test_bytea" ); stm.close(); } public static void main( String arg[] ) throws Exception { if( arg.length != 0 && arg.length != 3 ) { System.err.println( "Usage: java com.evolute.postgresql.test.ByteaTest <URL> <USER> <PASSWORD>" ); } else if( arg.length == 3 ) { url = arg[ 0 ]; user = arg[ 1 ]; password = arg[ 2 ]; } new ByteaTest( url, user, password ); } } /*------------------------------------------------------------------------- * * Copyright (c) 2003-2005, PostgreSQL Global Development Group * * IDENTIFICATION * $PostgreSQL: pgjdbc/org/postgresql/util/PGbytea.java,v 1.12 2005/01/11 08:25:49 jurka Exp $ * *------------------------------------------------------------------------- */ package org.postgresql.util; import java.sql.*; /** * Converts to and from the postgresql bytea datatype used by the backend. */ public class PGbytea { private static final int MAX_3_BUFF_SIZE = 0; /* * Converts a PG bytea raw value (i.e. the raw binary representation * of the bytea data type) into a java byte[] */ public static byte[] toBytes(byte[] s) throws SQLException { if (s == null) return null; final int slength = s.length; byte[] buf = null; int correctSize = slength; if (slength > MAX_3_BUFF_SIZE) { // long l = System.currentTimeMillis(); // count / * // for (int i = 0; i < slength; ++i) { byte current = s[i]; if (current == '\\') { byte next = s[ ++i ]; if (next == '\\') { --correctSize; } else { correctSize -= 3; } } } //System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON SECONDBUFFER" ); buf = new byte[correctSize]; } else { buf = new byte[slength]; } int bufpos = 0; int thebyte; byte nextbyte; byte secondbyte; for (int i = 0; i < slength; i++) { nextbyte = s[i]; if (nextbyte == (byte)'\\') { secondbyte = s[++i]; if (secondbyte == (byte)'\\') { //escaped \ buf[bufpos++] = (byte)'\\'; } else { thebyte = (secondbyte - 48) * 64 + (s[++i] - 48) * 8 + (s[++i] - 48); if (thebyte > 127) thebyte -= 256; buf[bufpos++] = (byte)thebyte; } } else { buf[bufpos++] = nextbyte; } } if (bufpos == correctSize) { //System.out.println( "SKIPPED LAST BUFFER" ); return buf; } byte[] l_return = new byte[bufpos]; System.arraycopy(buf, 0, l_return, 0, bufpos); return l_return; } /* * Converts a java byte[] into a PG bytea string (i.e. the text * representation of the bytea data type) */ public static String toPGString(byte[] p_buf) throws SQLException { if (p_buf == null) return null; StringBuffer l_strbuf = new StringBuffer(2 * p_buf.length); for (int i = 0; i < p_buf.length; i++) { int l_int = (int)p_buf[i]; if (l_int < 0) { l_int = 256 + l_int; } //we escape the same non-printable characters as the backend //we must escape all 8bit characters otherwise when convering //from java unicode to the db character set we may end up with //question marks if the character set is SQL_ASCII if (l_int < 040 || l_int > 0176) { //escape charcter with the form \000, but need two \\ because of //the parser l_strbuf.append("\\"); l_strbuf.append((char)(((l_int >> 6) & 0x3) + 48)); l_strbuf.append((char)(((l_int >> 3) & 0x7) + 48)); l_strbuf.append((char)((l_int & 0x07) + 48)); } else if (p_buf[i] == (byte)'\\') { //escape the backslash character as \\, but need four \\\\ because //of the parser l_strbuf.append("\\\\"); } else { //other characters are left alone l_strbuf.append((char)p_buf[i]); } } return l_strbuf.toString(); } } --- /home/lvflores/Desktop/postgresql-jdbc-8.1-407.src/org/postgresql/util/PGbytea.java 2005-01-11 08:25:49.000000000+0000 +++ PGbytea.java 2006-08-22 23:37:53.000000000 +0100 @@ -16,8 +16,9 @@ */ public class PGbytea { - - /* + private static final int MAX_3_BUFF_SIZE = 0; + + /* * Converts a PG bytea raw value (i.e. the raw binary representation * of the bytea data type) into a java byte[] */ @@ -25,8 +26,36 @@ { if (s == null) return null; - int slength = s.length; - byte[] buf = new byte[slength]; + final int slength = s.length; + byte[] buf = null; + int correctSize = slength; + if (slength > MAX_3_BUFF_SIZE) + { +// long l = System.currentTimeMillis(); + // count / * // + for (int i = 0; i < slength; ++i) + { + byte current = s[i]; + if (current == '\\') + { + byte next = s[ ++i ]; + if (next == '\\') + { + --correctSize; + } + else + { + correctSize -= 3; + } + } + } +//System.out.println( "TOOK: " + ( System.currentTimeMillis() - l ) + "ms - SAVED " + ( slength - correctSize ) + " ON SECONDBUFFER" ); + buf = new byte[correctSize]; + } + else + { + buf = new byte[slength]; + } int bufpos = 0; int thebyte; byte nextbyte; @@ -55,6 +84,11 @@ buf[bufpos++] = nextbyte; } } + if (bufpos == correctSize) + { +//System.out.println( "SKIPPED LAST BUFFER" ); + return buf; + } byte[] l_return = new byte[bufpos]; System.arraycopy(buf, 0, l_return, 0, bufpos); return l_return; @@ -103,6 +137,4 @@ } return l_strbuf.toString(); } - - }
pgsql-jdbc by date: