Generating a Short Random ID in MySQL

From time to time, I have had to go back to a legacy database and for one reason or another add a non-integer surrogate key. Sometimes this has come up when working on an old API that exposes a primary key in the URL (GET /account/123). This generally can be bad for security, as a user could attempt to access other resources by incrementing the ID (GET /account/124).

Generating a random string as a surrogate key in code can be easy enough, by using a UUID or a class like RandomStringUtils (careful, look up the Scunthorpe problem). But if you have lots of rows in a database that need to be updated, you will likely end up backfilling values in the database itself.

With Postgres 13, this is fairly simple:

select gen_random_uuid()::varchar
from some_table
UUID
7fc2113c-d471-4138-8e8e-b8aba6bcf3fc
b14bdeec-81eb-466f-b5b0-a2440e05b667
645aa4d2-4c3e-48ee-adb5-2d28f2c33ec9

MySQL's UUID algorithm is unfortunately a bit more predictable. It generates values that are sequential or near-sequential when run on the same database server at approximately the same time. Only the last characters of the first grouping will change. So, if you do a substring and select 6 characters, or only select the last 12 characters, you will end up with duplicate values. The same is true for the uuid_short function - the values will be sequential.

Some possible solutions in MySQL is to either run the UUID through a hash like sha, or use random_bytes. Just be sure to chose enough bytes to provide room for sufficient randomness to reduce the chance of collisions, and so that the probability of sequential IDs are low (ie: AABBCC, AABBCD). With hex (0-9,A-F), 6 characters allows for 16 million unique combinations, 8 over 4 billion, and 10 characters over 1 trillion.

select UUID(),
       uuid_short(),
       substr(sha(UUID()) from 1 for 6),
       hex(random_bytes(3))
from some_table
UUIDUUID_SHORTSHARANDOM_BYTES
d54aced3-cfd3-11ec-afbc-0242ac130002277177057966817432f18bd43E6A4
d54ad161-cfd3-11ec-afbc-0242ac1300022771770579668174402b27c7BD5CD
d54ad645-cfd3-11ec-afbc-0242ac1300022771770579668174585620e95835B
d54ad7d7-cfd3-11ec-afbc-0242ac130002277177057966817460c2a7b26D02E

Just keep in mind that relying on security by obscurity, by having a long identifier in a URL that would be hard to guess, is not the only solution you should rely on. Sensitive or protected resources should still be protected by good authentication and authorization.

Leave a Reply