Skip to content

UUID ​

This page is about the usage of UUIDs in Flowcontrol, their different types and how to store these in an SQL database.

Why do we need UUIDs? ​

There are several reasons why UUIDs may be useful:

  • Uniqueness: UUIDs are designed to be unique across all devices and all time. This makes them useful for generating unique IDs for objects or data that need to be identified across different systems or devices.
  • Randomness: UUIDs are generated using random numbers, which makes it unlikely that two UUIDs will be the same. This makes them suitable for generating unique IDs for objects or data that need to be identified in a secure and unpredictable way.
  • Ease of use: UUIDs are easy to generate and can be easily stored as strings. They do not require a central authority or database to manage them, which makes them useful for distributed systems.
  • Portability: UUIDs are standardized and can be used in any programming language or database system. This makes them useful for sharing data and objects between different systems.
  • Security: Using UUIDs as resource IDs can help to prevent vulnerabilities such as sequential resource enumeration, where an attacker can guess the IDs of resources and potentially access sensitive data. UUIDs, being random and hard to guess, can help to prevent this type of attack.
  • Scalability: UUIDs can be generated on the client side, which means that a REST API does not need to manage the allocation of resource IDs. This can help to improve the scalability of the API, as it does not need to handle the additional workload of managing resource IDs.

Overall, UUIDs are a useful tool for generating unique IDs that can be used to identify objects and data in a variety of situations. They are widely used in computer systems to identify resources such as database records and files.

Versions ​

A universally unique identifier (UUID) is a string of characters that is guaranteed to be unique across all devices and all time. There are several versions of UUIDs, which differ in the way they are generated and the size of the resulting string.

Here are the main differences between the different versions of UUIDs:

  • UUID version 1 (UUIDv1): This version of UUID is generated using the device's MAC address and the current time. It includes a timestamp and the MAC address of the device that generated it. UUIDv1 is relatively easy to generate, but it can be traced back to the device that generated it, which may not be desirable in some cases.
  • UUID version 2 (UUIDv2): This version of UUID is similar to UUIDv1, but it is based on the POSIX UID (user ID) and GID (group ID) rather than the MAC address. It is not commonly used.
  • UUID version 3 (UUIDv3): This version of UUID is generated by hashing a namespace identifier and a name with the MD5 hash function. It is a fixed-size string of characters, with a length of 32 characters. UUIDv3 is relatively easy to generate and is suitable for generating unique IDs for objects or data that are not associated with a particular device.
  • UUID version 4 (UUIDv4): This version of UUID is generated using random numbers. It is a fixed-size string of characters, with a length of 32 characters. UUIDv4 is suitable for generating unique IDs for objects or data that are not associated with a particular device and for which uniqueness is important.
  • UUID version 5 (UUIDv5): This version of UUID is similar to UUIDv3, but it uses the SHA-1 hash function instead of MD5. It is a fixed-size string of characters, with a length of 32 characters. UUIDv5 is suitable for generating unique IDs for objects or data that are not associated with a particular device and for which uniqueness is important.

Addressing UUID Storage Issues in SQL Server Databases ​

Abstract: ​

This research paper investigates the discrepancies observed when storing UUIDs in SQL Server databases using the uniqueidentifier data type versus the binary(16) data type. We analyze the differences between these storage options and propose a VARCHAR(36) approach to mitigate these issues and provide better compatibility across different databases.

Introduction ​

UUIDs (Universally Unique Identifiers) are widely used for uniquely identifying objects in distributed systems. However, storing UUIDs in SQL Server databases has presented challenges due to discrepancies in byte order between SQL Server's uniqueidentifier data type and the standard UUID representation. This paper aims to address these discrepancies and propose a storage solution that ensures compatibility and portability across different databases.

Background ​

The uniqueidentifier data type in SQL Server stores UUIDs in a little-endian format, while the standard UUID representation follows a big-endian format. This results in UUIDs appearing different in the database when using the uniqueidentifier data type compared to the standard representation. For instance, a UUID such as 'a4a3c9ff-ea68-4ee0-8d57-1930b2ffdbfb' would be stored as 'FFC9A3A4-68EA-E04E-8D57-1930B2FFDBFB' in the database.

Analysis ​

We evaluated three primary options for storing UUIDs in SQL Server databases: VARCHAR(36), BINARY(16), and SQL Server's UNIQUEIDENTIFIER.

VARCHAR(36) ​

Pros:

  • Human-readable
  • Portable across different databases

Cons:

  • Takes up more storage space compared to binary storage
  • Slower for indexing and searching

BINARY(16) ​

Pros:

  • Efficient storage
  • Better performance for indexing and searching

Cons:

  • Not human-readable
  • Can have byte order issues, as experienced with SQL Server

SQL Server's UNIQUEIDENTIFIER ​

Pros:

  • Native support for UUIDs in SQL Server
  • Can be used with built-in functions

Cons:

  • Not portable across different databases
  • Requires custom UserType or other handling to maintain standard UUID representation in Java

Proposed Solution ​

Based on our analysis, we recommend using the VARCHAR(36) approach for storing UUIDs. This method ensures portability across different databases and maintains human-readable UUIDs. Although this solution consumes more storage space and may be slightly slower for indexing and searching compared to binary storage, it offers the benefits of compatibility and readability.

Conclusion ​

To address the UUID storage issues in SQL Server databases, we recommend using the VARCHAR(36) approach. This method provides portability across different databases and human-readable UUIDs, albeit with some trade-offs in storage space and performance. By adopting the VARCHAR(36) storage option, we can mitigate the discrepancies observed when using SQL Server's uniqueidentifier data type and ensure a more consistent experience for developers and users alike.