A plan to improve error messages with context, hint and details. - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | A plan to improve error messages with context, hint and details. |
Date | |
Msg-id | Pine.LNX.4.58.0403041742590.28778@sablons.cri.ensmp.fr |
Responses |
Re: A plan to improve error messages with context, hint
Re: A plan to improve error messages with context, hint and details. |
List | pgsql-hackers |
Dear Hackers,

Motivation
----------

As a basic user of postgresql, I've been quite disappointed by the lack of help provided by postgresql error messages dealing with syntax and semantic errors, especially in long SQL statements:

  ERROR: syntax error at or near "(" at character 326

This makes students feel angry at the software, the computer or even the teacher (say, me;-), for the bad time they have dealing with syntactic issues. It makes them turn to mysql;-)

I think it is an important issue, as the interface is the first contact between the user and the software. The internals and features may be great, but it is not bad either to help users deal more easily with the beast.

There are several points on which great improvements can be obtained without much disruption to the current code structure.

Note that this plan is for "ERROR", where the processing is somehow stopped, and the user must choose a course of action in order to solve the problem. However, other levels of reports (such as NOTICE or WARNING) may also benefit from it.

Also, things may not be as simple as described, but the purpose of this mail is to describe the ultimate goal, and to outline the path that I think should lead to it. So here is my suggested plan:

(1) Lexical/syntax error source localisation
--------------------------------------------

An extract of the offending source must be shown, if possible, alongside syntax error messages. This can be achieved very simply and at low cost, since all the information is already there, as well as most of the needed fields in ErrorData. However, it may be necessary to handle multi-line details or an additional sub-detail (I would choose that) in ErrorData in order to show a cursor:

  ERROR: Syntax error at or near "(" at character 14
  DETAIL: CREATE TABLE (id SERIAL ...
  DETAIL:              ^

The only actual issue seems to be multi-byte encodings in the buffer, but I noticed that some support functions are already available.
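To make point (1) concrete, here is a minimal C sketch of how the extract and the cursor line could be built from the query text and the reported character position. It is only an illustration: the function name `format_error_extract` and its parameters are hypothetical and not part of PostgreSQL, and it assumes a single-byte encoding, so the multi-byte issue mentioned above is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing PostgreSQL function).
 * Fills `extract` with at most `width` bytes of `query` taken around the
 * 1-based error position `pos`, and `cursor` with spaces followed by '^'
 * so that the caret sits under the offending byte.
 * Assumption: single-byte encoding. Both buffers must hold width + 1 bytes.
 */
void
format_error_extract(const char *query, int pos, int width,
                     char *extract, char *cursor)
{
    int len = (int) strlen(query);
    int start, end, i;

    assert(width > 0 && len > 0);

    /* Clamp the reported position to the query bounds. */
    if (pos < 1)
        pos = 1;
    if (pos > len)
        pos = len;

    /* Choose a window of at most `width` bytes around the error position. */
    start = (pos - 1) - width / 2;
    if (start < 0)
        start = 0;
    end = start + width;
    if (end > len)
        end = len;

    /* Copy the window, flattening line breaks and tabs onto one line. */
    for (i = start; i < end; i++)
    {
        char c = query[i];

        extract[i - start] = (c == '\n' || c == '\r' || c == '\t') ? ' ' : c;
    }
    extract[end - start] = '\0';

    /* Pad with spaces up to the error column, then place the caret. */
    for (i = 0; i < (pos - 1) - start; i++)
        cursor[i] = ' ';
    cursor[i] = '^';
    cursor[i + 1] = '\0';
}
```

Called with the statement from the example, position 14 and a width of 40, the extract is the whole statement and the cursor line has the caret in column 14, under the "(". With a narrower width, only a window around the error is kept and the caret moves accordingly.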

(2) Hints about syntax errors
-----------------------------

All generated error messages, especially from the parser, should be accompanied by a HINT to help the user, if possible. Something like:

  HINT: table name expected

This requires more work, as all syntax error sources need to be caught and a relevant HINT must be provided.

There is a bit of an issue here, as the call from yyerror to ereport is rather simplistic. I would suggest having a "current_hint" (scalar or maybe stack) maintained by the parser, which would be used by yyerror to fill the hint field. The yacc code may look something like:

<code>
CreateUserStmt:
    Create USER { hint("user id"); }
    Userid { hint("user options or WITH"); }
    opt_with { hint("user options"); }
    OptUserList { ... };

Create:
    CREATE { hint("USER|DATABASE|SCHEMA|..."); };
</code>

The changes are pretty systematic and simple, and they do not modify the actual grammar of the parser. However, they would affect a lot of lines in "gram.y", if not all.

(3) About semantic errors, which may be detected later on ...
-------------------------------------------------------------

... in the processing of the command. The problem is different because it occurs in functions that can be called from quite different contexts, and the context is not really known to the function. Thus when the error occurs in the function, it cannot provide a useful context. As none is provided, the user must guess...

For this I have used the following trick in the past: a stack describes the processing context and is updated by functions with pushes and pops. If an error occurs, the stack provides the context information needed, something like:

  CONTEXT: parsing user query

Or

  CONTEXT: in create table "foo", in constraint "bla", checking reference types ...

This is basically a user-oriented view of the call stack to help with error messages.
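
The push/pop trick can be sketched in a few lines of C. This is only an illustration under my own assumptions: the names `context_push`, `context_pop`, `context_reset` and `context_format`, as well as the fixed stack depth, are hypothetical and do not exist in the PostgreSQL sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONTEXT_STACK_MAX 16    /* arbitrary depth chosen for the sketch */

/* Hypothetical user-oriented context stack; all names are illustrative. */
static const char *context_stack[CONTEXT_STACK_MAX];
static int context_depth = 0;

/* Enter a new processing step; the string must outlive the stack entry. */
void
context_push(const char *description)
{
    assert(context_depth < CONTEXT_STACK_MAX);
    context_stack[context_depth++] = description;
}

/* Leave the innermost processing step. */
void
context_pop(void)
{
    assert(context_depth > 0);
    context_depth--;
}

/* Drop everything, e.g. once an error has been reported. */
void
context_reset(void)
{
    context_depth = 0;
}

/* Join the entries, outermost first, as "a, b, c"; truncates if buf is small. */
void
context_format(char *buf, size_t bufsize)
{
    size_t used = 0;
    int i;

    assert(bufsize > 0);
    buf[0] = '\0';
    for (i = 0; i < context_depth && used < bufsize - 1; i++)
    {
        int n = snprintf(buf + used, bufsize - used, "%s%s",
                         i > 0 ? ", " : "", context_stack[i]);

        if (n < 0)
            break;
        used += (size_t) n;
    }
}
```

A function handling CREATE TABLE would push `in create table "foo"` on entry and pop it on exit; the constraint checker would do the same with its own description. On error, the reporting code would call `context_format` to build the CONTEXT line and then `context_reset`.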

It's really incremental: if nothing is done the context will be vague, but if some key functions care to update it precisely, the context reported will be much more helpful. It should typically be maintained in the error logging part, and used on errors to build a context if none is provided, or be provided as a separate field next to "message, detail, hint, context". Care must also be taken to reset the stack on errors.

Note that the current "context" management in elog allows multiple pieces of context information to be provided, but I haven't seen any "pop" facility, which functions would need in order to change the current context simply; the strings are just appended one to the other.

Two questions
-------------

My research background is in code optimisation within compilers, but now I mostly teach computer science stuff in engineering schools. I would be interested in giving some of my time to these issues, but:

(1) Do postgresql "Masters" think this issue is worth pursuing, or will any patch be rejected because it is considered intrinsically useless? "Our users do not need hint or context information, only hard-core engineers use postgresql, sissy guys will rather use mysql" ;-)

Indeed, I don't mind having a patch rejected because of my poor programming, or because the result is not fine enough, as I can improve it and re-submit later. However, if the issue is considered useless, it means that I would waste my time anyway, so I would prefer not to give it a try;-)

(2) Does anyone have any comments about these problems or the way I intend to address them? Are they currently being addressed by someone else? It doesn't look so from the TODO list.

If the plan makes sense, it may be added to the TODO list, and I wish to claim it or part of it.

Have a nice day,

-- 
Fabien Coelho - coelho@cri.ensmp.fr