I agree with that person. Here is an implementation.
Basically, add a UNIQUE array column of all chem_ids to table mixtures and keep it current with triggers. Arrays must be sorted consistently, I use the additional module intarray for that and to optimize performance.
For lack of definition I assume frequent multi-row writes, making it a perfect candidate for transition tables introduced with Postgres 10. See:
Important note: Technically, this works in Postgres 10. But while testing I ran into a bug of intarray functions that seemed oddly familiar: empty arrays would not compare equal due to incorrect internal array dimensions. Tom Lane found and fixed this for Postgres 11, but it was not backported to Postgres 10. I strongly advise to use Postgres 11 with this.
Turns out to be another instance of a bug I reported myself earlier. See here and here. Took me a while to reproduce and get the full picture.
This uses a variety of advanced features. Not recommended for beginners.
Code
CREATE TABLE chemicals (
chem_id serial UNIQUE NOT NULL
, name text PRIMARY KEY
);
CREATE TABLE mixtures (
mixture_id serial PRIMARY KEY
, chem_ids int[] UNIQUE -- default NULL !
);
CREATE TABLE mixture_chems (
mixture_id int REFERENCES mixtures (mixture_id)
, chem_id int REFERENCES chemicals (chem_id)
);
INSERT trigger
CREATE OR REPLACE FUNCTION trg_mixture_chems_insaft()
RETURNS trigger AS
$func$
BEGIN
UPDATE mixtures AS m
SET chem_ids = sort(COALESCE(m.chem_ids, '{}') + n.chem_ids)
FROM (
SELECT mixture_id, array_agg(chem_id) AS chem_ids
FROM new_table
GROUP BY 1
) n
WHERE m.mixture_id = n.mixture_id;
RETURN NULL;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER mixture_chems_insaft
AFTER INSERT ON mixture_chems
REFERENCING NEW TABLE AS new_table
FOR EACH STATEMENT
EXECUTE PROCEDURE trg_mixture_chems_insaft();
UPDATE trigger
CREATE OR REPLACE FUNCTION trg_mixture_chems_upaft()
RETURNS trigger AS
$func$
BEGIN
UPDATE mixtures AS m
SET chem_ids = sort(COALESCE(m.chem_ids, '{}')
- COALESCE(o.chem_ids, '{}')
+ COALESCE(n.chem_ids, '{}'))
FROM (
SELECT mixture_id, array_agg(chem_id) AS chem_ids
FROM new_table
GROUP BY 1
) n
FULL JOIN (
SELECT mixture_id, array_agg(chem_id) AS chem_ids
FROM old_table
GROUP BY 1
) o USING (mixture_id)
WHERE m.mixture_id = COALESCE(n.mixture_id, o.mixture_id)
AND m.chem_ids IS DISTINCT FROM sort(COALESCE(m.chem_ids, '{}')
- COALESCE(o.chem_ids, '{}')
+ COALESCE(n.chem_ids, '{}'));
RETURN NULL;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER mixture_chems_upaft
AFTER UPDATE ON mixture_chems
REFERENCING NEW TABLE AS new_table
OLD TABLE AS old_table
FOR EACH STATEMENT
EXECUTE PROCEDURE trg_mixture_chems_upaft();
DELETE trigger
CREATE OR REPLACE FUNCTION trg_mixture_chems_delaft()
RETURNS trigger AS
$func$
BEGIN
UPDATE mixtures AS m
SET chem_ids = m.chem_ids - o.chem_ids -- assuming this does not upset sort order!
FROM (
SELECT mixture_id, array_agg(chem_id) AS chem_ids
FROM old_table
GROUP BY 1
) o
WHERE m.mixture_id = o.mixture_id
AND m.chem_ids IS DISTINCT FROM (m.chem_ids - o.chem_ids);
RETURN NULL;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER mixture_chems_delaft
AFTER DELETE ON mixture_chems
REFERENCING OLD TABLE AS old_table
FOR EACH STATEMENT
EXECUTE PROCEDURE trg_mixture_chems_delaft();
db<>fiddle here
This implementation is strict: a mixture with no chemicals (chem_ids = '{}') is just another case that is only allowed once. You may want to allow that multiple times instead. This state is only reached after deleting all existing components, newly inserted row in mixtures start out with chem_ids IS NULL to dodge this UNIQUE constraint.
And you may want to add a PRIMARY KEY constraint to disallow adding the same chemical to a mixture multiple times:
CREATE TABLE mixture_chems (
mixture_id INTEGER REFERENCES mixtures (mixture_id)
, chem_id INTEGER REFERENCES chemicals (chem_id)
, PRIMARY KEY (mixture_id, chem_id)
);
But my implementation works either way.
Related: