Re: large object seek/write bug - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: large object seek/write bug |
Date | |
Msg-id | 18322.961050495@sss.pgh.pa.us Whole thread Raw |
In response to | (Ian Grant <Ian.Grant@cl.cam.ac.uk>) |
Responses |
Re: large object seek/write bug
|
List | pgsql-bugs |
Ian Grant <Ian.Grant@cl.cam.ac.uk> writes: > I'm using V 7.0 on a Linux machine and I believe I have found a bug in the > large object interface provided by libpq. Thank you for the carefully developed test case. The bug is actually in the backend, not in libpq: there's an ancient hack in inv_write that looks at rel->rd_nblocks to decide if the large object is empty. Unfortunately rd_nblocks isn't maintained very carefully, so the test might mistakenly consider a recently-created LO to be empty, leading to the Wrong Thing. But that hack is no longer necessary, since the index bug it was trying to work around was swatted ages ago; diking out the check suffices to fix it. Another error I noticed while testing the fix is that inv_read doesn't cope gracefully with "holes" in the large object, ie, ranges never written because of a seek and write past the previous end of file. With the attached patch, reads of "holes" reliably return zeroes, as is considered standard behavior for Unix files. (The garbage data you saw was actually from this error, although the case would not have been triggered if not for the first bug.) I believe the attached patch fixes these problems, but I haven't been able to wring it out very thoroughly because I don't have applications that do random seeks and writes in large objects. If you could bang on it a little more and report back, that'd be great. regards, tom lane *** src/backend/storage/large_object/inv_api.c.orig Wed Apr 12 13:15:37 2000 --- src/backend/storage/large_object/inv_api.c Thu Jun 15 02:10:27 2000 *************** *** 185,190 **** --- 185,191 ---- retval->idesc = RelationGetDescr(indr); retval->offset = retval->lowbyte = retval->highbyte = 0; ItemPointerSetInvalid(&(retval->htid)); + retval->flags = 0; if (flags & INV_WRITE) { *************** *** 233,238 **** --- 234,240 ---- retval->idesc = RelationGetDescr(indrel); retval->offset = retval->lowbyte = retval->highbyte = 0; ItemPointerSetInvalid(&(retval->htid)); + retval->flags = 0; if (flags & INV_WRITE) { *************** *** 371,384 **** if (whence == SEEK_CUR) { offset += obj_desc->offset; /* calculate absolute position */ - return inv_seek(obj_desc, offset, SEEK_SET); } ! ! /* ! * if you seek past the end (offset > 0) I have no clue what happens ! * :-( B.L. 9/1/93 ! */ ! if (whence == SEEK_END) { /* need read lock for getsize */ if (!(obj_desc->flags & IFS_RDLOCK)) --- 373,380 ---- if (whence == SEEK_CUR) { offset += obj_desc->offset; /* calculate absolute position */ } ! else if (whence == SEEK_END) { /* need read lock for getsize */ if (!(obj_desc->flags & IFS_RDLOCK)) *************** *** 389,396 **** offset += _inv_getsize(obj_desc->heap_r, obj_desc->hdesc, obj_desc->index_r); - return inv_seek(obj_desc, offset, SEEK_SET); } /* * Whenever we do a seek, we turn off the EOF flag bit to force --- 385,392 ---- offset += _inv_getsize(obj_desc->heap_r, obj_desc->hdesc, obj_desc->index_r); } + /* now we can assume that the operation is SEEK_SET */ /* * Whenever we do a seek, we turn off the EOF flag bit to force *************** *** 414,422 **** * stores the offset of the last byte that appears on it, and we have * an index on this. */ - - - /* right now, just assume that the operation is SEEK_SET */ if (obj_desc->iscan != (IndexScanDesc) NULL) { d = Int32GetDatum(offset); --- 410,415 ---- *************** *** 424,430 **** } else { - ScanKeyEntryInitialize(&skey, 0x0, 1, F_INT4GE, Int32GetDatum(offset)); --- 417,422 ---- *************** *** 487,495 **** /* copy the data from this block into the buffer */ d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull); ReleaseBuffer(buffer); ! fsblock = (struct varlena *) DatumGetPointer(d); off = obj_desc->offset - obj_desc->lowbyte; ncopy = obj_desc->highbyte - obj_desc->offset + 1; --- 479,505 ---- /* copy the data from this block into the buffer */ d = heap_getattr(&tuple, 2, obj_desc->hdesc, &isNull); + fsblock = (struct varlena *) DatumGetPointer(d); ReleaseBuffer(buffer); ! /* ! * If block starts beyond current seek point, then we are looking ! * at a "hole" (unwritten area) in the object. Return zeroes for ! * the "hole". ! */ ! if (obj_desc->offset < obj_desc->lowbyte) ! { ! int nzeroes = obj_desc->lowbyte - obj_desc->offset; ! ! if (nzeroes > (nbytes - nread)) ! nzeroes = (nbytes - nread); ! MemSet(buf, 0, nzeroes); ! buf += nzeroes; ! nread += nzeroes; ! obj_desc->offset += nzeroes; ! if (nread >= nbytes) ! break; ! } off = obj_desc->offset - obj_desc->lowbyte; ncopy = obj_desc->highbyte - obj_desc->offset + 1; *************** *** 535,548 **** Buffer buffer; /* ! * Fetch the current inversion file system block. If the class ! * storing the inversion file is empty, we don't want to do an ! * index lookup, since index lookups choke on empty files (should ! * be fixed someday). */ ! if ((obj_desc->flags & IFS_ATEOF) ! || obj_desc->heap_r->rd_nblocks == 0) tuple.t_data = NULL; else inv_fetchtup(obj_desc, &tuple, &buffer); --- 545,555 ---- Buffer buffer; /* ! * Fetch the current inversion file system block. We can skip ! * the work if we already know we are at EOF. */ ! if (obj_desc->flags & IFS_ATEOF) tuple.t_data = NULL; else inv_fetchtup(obj_desc, &tuple, &buffer);
pgsql-bugs by date: