From time to time, I have had to go back to a legacy database and for one reason or another add a non-integer surrogate key. Sometimes this has come up when working on an old API that exposes a primary key in the URL (
GET /account/123). This generally can be bad for security, as a user could attempt to access other resources by incrementing the ID (
Generating a random string as a surrogate key in code can be easy enough, by using a UUID or a class like RandomStringUtils (careful, look up the Scunthorpe problem). But if you have lots of rows in a database that need to be updated, you will likely end up backfilling values in the database itself.
With Postgres 13, this is fairly simple:
select gen_random_uuid()::varchar from some_table
MySQL’s UUID algorithm is unfortunately a bit more predictable. It generates values that are sequential or near-sequential when run on the same database server at approximately the same time. Only the last characters of the first grouping will change. So, if you do a substring and select 6 characters, or only select the last 12 characters, you will end up with duplicate values. The same is true for the
uuid_short function – the values will be sequential.
Some possible solutions in MySQL is to either run the UUID through a hash like
sha, or use
random_bytes. Just be sure to chose enough bytes to provide room for sufficient randomness to reduce the chance of collisions, and so that the probability of sequential IDs are low (ie:
AABBCD). With hex (0-9,A-F), 6 characters allows for 16 million unique combinations, 8 over 4 billion, and 10 characters over 1 trillion.
select UUID(), uuid_short(), substr(sha(UUID()) from 1 for 6), hex(random_bytes(3)) from some_table
Just keep in mind that relying on security by obscurity, by having a long identifier in a URL that would be hard to guess, is not the only solution you should rely on. Sensitive or protected resources should still be protected by good authentication and authorization.