Re: Weird procedure question - Mailing list pgsql-general
From | digimer |
---|---|
Subject | Re: Weird procedure question |
Date | |
Msg-id | b724adcc-7cb2-cb61-a6d4-74c653955752@alteeve.ca Whole thread Raw |
In response to | Re: Weird procedure question (Tim Cross <theophilusx@gmail.com>) |
Responses |
Re: Weird procedure question
|
List | pgsql-general |
On 2018-09-25 6:22 p.m., Tim Cross wrote: > digimer <lists@alteeve.ca> writes: > >> Hi all, >> >> I've got an interesting use case that I am stuck on. It's a bit of a >> complicated environment, but I'll try to keep it simple. >> >> In short; I have a history schema that has tables that match the >> public schema, plus one 'history_id' column that has a simple sequential >> bigserial value. Then I have a procedure and trigger that, on UPDATE or >> INSERT, copies the data to history. Example use case is that I can >> UPDATE a sensor value in the public table and it's also INSERTs the data >> into history. So public shows just the most recent values, but I can see >> changes over time in the history schema. >> >> I have built my system to support writing to one or more DBs. I keep >> a list of connected DBs and send INSERT/UPDATE calls to a method that >> then runs the UPDATE/INSERT against all connected databases, as a form >> of redundancy. This all works fine. >> >> The problem I've hit is that the 'history_id' differs between the >> various databases. So I want to switch this to 'history_uuid' and use >> UUIDs instead of bigserial. >> >> Now the question; >> >> Can I tell a produce to use a specific UUID? >> >> The idea is to generate a UUID for 'history_uuid' so that I have >> consistency across databases. Of course, if an UPDATE will change >> multiple rows, then I'll need to predefine multiple UUIDs. This is where >> things start to get really complicated I think... Maybe I could pass an >> array of UUIDs? I don't care if I find out which UUID was used for which >> record, just that the same UUID was used for the same record when the >> procedure is (re)run on other DBs. >> >> The databases are not clustered, on purpose. I've been trying to >> handle all the HA stuff in my application for various reasons. >> >> If it helps, here is an example pair of tables, the procedure and the >> trigger I currently use; >> >> ==== >> CREATE TABLE host_variable ( >> host_variable_uuid uuid not null primary key, >> host_variable_host_uuid uuid not null, >> host_variable_name text not null, >> host_variable_value text not null, >> modified_date timestamp with time zone not null >> ); >> ALTER TABLE host_variable OWNER TO admin; >> >> CREATE TABLE history.host_variable ( >> history_id bigserial, >> host_variable_uuid uuid, >> host_variable_host_uuid uuid, >> host_variable_name text, >> host_variable_value text, >> modified_date timestamp with time zone not null >> ); >> ALTER TABLE history.host_variable OWNER TO admin; >> >> CREATE FUNCTION history_host_variable() RETURNS trigger >> AS $$ >> DECLARE >> history_host_variable RECORD; >> BEGIN >> SELECT INTO history_host_variable * FROM host_variable WHERE >> host_uuid = new.host_uuid; >> INSERT INTO history.host_variable >> (host_variable_uuid, >> host_variable_host_uuid, >> host_variable_name, >> host_variable_value, >> modified_date) >> VALUES >> (history_host_variable.host_variable_uuid, >> history_host_variable.host_variable_host_uuid, >> history_host_variable.host_variable_name, >> history_host_variable.host_variable_value, >> history_host_variable.modified_date); >> RETURN NULL; >> END; >> $$ >> LANGUAGE plpgsql; >> ALTER FUNCTION history_host_variable() OWNER TO admin; >> >> CREATE TRIGGER trigger_host_variable >> AFTER INSERT OR UPDATE ON host_variable >> FOR EACH ROW EXECUTE PROCEDURE history_host_variable(); >> ==== >> >> I know this might sound odd, but I didn't want to complicate things >> with how my system works. However, if it would help solve the problem, >> I'm happy to dig into more detail. >> >> Thanks! > I think James has probably given you the input you need - basically, > don't allow the system to automatically set the modified time - make > that parameter to your function or set that value before the copy to the > history tables - content would then be the same, so uuid v3 should work. > > However, I do think you have another big problem lurking in the > shadows. What happens if any of your connected databases are unavailable > or unreachable for a period of time? I suspect your going to run into > update anomalies and depending on your setup/environment, possibly even > partitioning problems (depending on number of clients and typology > etc). These are well known problems in distributed or replication > systems. > > You appear to be implementing a 'poor mans' replication system. There > are lots of complex issues to deal with and I wonder why you want to > take them on when PG has already got well tested and robust solutions > for this that would simplify your architecture and avoid the need to > re-implement functionality which already exists? > > regards, > > Tim > Hi Tim, Last I checked, pgsql couldn't handle this; Two DBs up, getting data. DB1 goes down, DB2 continues to collect data. DB2 goes down DB1 comes back up, starts collecting data. DB2 comes back up, now I need to move data in both directions (DB1 has data 2 doesn't and vice-versa). I've created a way to resolve this in my application and it's worked for some time (obviously, in my application only. It's not a general purpose system nor is it intended to be). For the record, I realized I was looking for a complex solution to a simple problem. I do create the 'modified_date' value in my app, and I just needed to refresh it between UPDATEs/INSERTs on the same column so that no two records in the history table have the same 'modified_date'. With that, my resync works again. Cheers, digimer
pgsql-general by date: