David Grayson
2015-08-02 03:56:23 UTC
Hello. Attached is a patch that adds a complete implementation of intsafe.h that I generated and tested using Ruby. It would be great if someone could merge it in.
The version of intsafe.h included in this patch can also be viewed here:
https://github.com/DavidEGrayson/intsafe/blob/1.1.0/generated/intsafe.h
That repository also has the Ruby scripts that I used to generate and test the header. Hopefully we can keep using those scripts to update intsafe.h in the future instead of editing it by hand. I believe using scripts to generate a highly repetitive header like this is the best way to ensure consistency and avoid errors.
This patch also fixes the definitions of CHAR_MIN and CHAR_MAX in limits.h so that they have the right values when char is unsigned (-funsigned-char). This change is necessary to make intsafe.h work because I used limits defined in limits.h and stdint.h instead of redefining them.
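For readers who have not run into this limits.h issue, here is a minimal sketch of limits that follow the signedness of char. It is not the text of the patch. The SKETCH_ names are hypothetical (chosen so they do not clash with the real limits.h), an 8-bit char is assumed, and the sketch relies on GCC and Clang predefining __CHAR_UNSIGNED__ when char is unsigned, for example under -funsigned-char.

```c
/* Hypothetical SKETCH_ names; assumes an 8-bit char. */
#define SKETCH_SCHAR_MIN (-128)
#define SKETCH_SCHAR_MAX 127
#define SKETCH_UCHAR_MAX 255

/* GCC and Clang predefine __CHAR_UNSIGNED__ when plain char is unsigned. */
#ifdef __CHAR_UNSIGNED__
#define SKETCH_CHAR_MIN 0
#define SKETCH_CHAR_MAX SKETCH_UCHAR_MAX
#else
#define SKETCH_CHAR_MIN SKETCH_SCHAR_MIN
#define SKETCH_CHAR_MAX SKETCH_SCHAR_MAX
#endif
```

With limits like these, a range check written against SKETCH_CHAR_MIN and SKETCH_CHAR_MAX stays correct whichever way char is configured in a given translation unit.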
Some statistics:
- The intsafe.h documentation from Microsoft defines 253 inline functions:
  - 193 integer conversion functions
  - 60 math operations (20 types with 3 operations each)
- This implementation of intsafe.h is only 1562 lines (6.2 lines per function):
  - 134 function bodies
  - 119 functions defined as a simple preprocessor macro pointing to a compatible function
- This implementation of intsafe.h is generated from 693 lines of Ruby code
- Microsoft's bulky version from the Windows SDK takes 8570 lines (33.9 lines per function)
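To illustrate the split between function bodies and macro aliases, here is a minimal sketch and not an excerpt from the generated header. All sketch_ and SKETCH_ names are hypothetical stand-ins for the Windows types and error codes, and the sentinel written on failure is an assumption. The idea is that when two type names share one underlying type, only one function body is needed and the second name can be a macro.

```c
#include <limits.h>

typedef int sketch_result;             /* stand-in for HRESULT */
#define SKETCH_OK 0
#define SKETCH_OVERFLOW (-1)           /* stand-in for an overflow error code */

typedef unsigned short sketch_ushort;  /* plays the role of USHORT */
typedef unsigned short sketch_word;    /* plays the role of WORD: same underlying type */

/* One real function body: range-checked unsigned int -> unsigned short. */
static inline sketch_result sketch_UIntToUShort(unsigned int operand,
                                                sketch_ushort *result)
{
    *result = USHRT_MAX;               /* assumed sentinel, written on every path */
    if (operand > USHRT_MAX) {
        return SKETCH_OVERFLOW;
    }
    *result = operand;                 /* value is known to fit, so no cast is needed */
    return SKETCH_OK;
}

/* The second name is only an alias for the compatible function. */
#define sketch_UIntToWord sketch_UIntToUShort
```

A call such as sketch_UIntToWord(x, &w) with w declared as sketch_word compiles to exactly the same code as the USHORT variant, which is how 253 documented names can be covered by far fewer bodies.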
I made very light use of the preprocessor because I had Ruby at my disposal, and I think this resulted in very clear and easy to understand code. One thing that makes the code easier to check is the lack of casting in the main computations. Casting can suppress a lot of useful compiler warnings so I simply didn't do it, and I fixed the root causes of the warnings instead.
I have a giant auto-generated test suite to test implementations of intsafe.h, and I was constantly running it against all combinations of architecture (32-bit and 64-bit), language (C and C++), and char type (signed and unsigned). The tests were run with the options "-Wall -Werror -pedantic -O1". I think I am aware of most of the issues regarding undefined behavior from integers and I think I avoided them all. Although the following conditions are not strictly necessary, I think my code conforms to them: no operation should ever overflow, and no value should ever get converted to a type that cannot represent it. I designed every function in intsafe.h so that it is guaranteed to write to its output parameter at least once before returning, in order to protect users from the undefined behavior of reading from an uninitialized variable.
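The two conditions above (no operation overflows, and the output parameter is always written) can be sketched as follows. This is an illustration under assumptions rather than code from the generated header: the sketch_ names, the result codes, and the sentinel values stored on failure are all hypothetical. The point is that each check is arranged so that the test itself cannot overflow, and no casts are required.

```c
#include <limits.h>

typedef int sketch_result;     /* stand-in for HRESULT */
#define SKETCH_OK 0
#define SKETCH_OVERFLOW (-1)   /* stand-in for an overflow error code */

/* Signed addition: compare against the remaining headroom before adding. */
static inline sketch_result sketch_int_add(int a, int b, int *result)
{
    *result = -1;              /* assumed sentinel; output is written on every path */
    if (b > 0 && a > INT_MAX - b) {
        return SKETCH_OVERFLOW;   /* a + b would exceed INT_MAX */
    }
    if (b < 0 && a < INT_MIN - b) {
        return SKETCH_OVERFLOW;   /* a + b would fall below INT_MIN */
    }
    *result = a + b;           /* proven to be in range */
    return SKETCH_OK;
}

/* Unsigned multiplication: divide the limit instead of multiplying first. */
static inline sketch_result sketch_uint_mult(unsigned int a, unsigned int b,
                                             unsigned int *result)
{
    *result = UINT_MAX;        /* assumed sentinel; output is written on every path */
    if (a != 0 && b > UINT_MAX / a) {
        return SKETCH_OVERFLOW;
    }
    *result = a * b;
    return SKETCH_OK;
}
```

Because INT_MAX - b is only evaluated when b is positive and INT_MIN - b only when b is negative, neither subtraction can leave the range of int, so the checks stay free of undefined behavior.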
I wasn't sure if I should use the always_inline attribute, so I didn't. Certain things you might do in C can cause undefined reference errors because no non-inline definition is provided by default. However, the header can easily be used to generate non-inline definitions if needed, either by the user or by the mingw-w64 developers. Since CHAR might be signed in one translation unit and unsigned in another, I took measures to prevent accidentally linking to the wrong version of a function that operates on a CHAR (see line 956).
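As background on the undefined reference issue, here is a minimal sketch of how C99 inline semantics behave and how a non-inline definition can be requested. It is not taken from the header. The function sketch_low_byte is hypothetical, and the sketch assumes a compiler in C99 or later mode (not the older gnu89 inline rules).

```c
/* Would normally live in a header: under C99 rules this is an inline
   definition only, so no external symbol is emitted for it. */
inline unsigned int sketch_low_byte(unsigned int x)
{
    return x & 0xFFu;
}

/* Would normally live in exactly one .c file: redeclaring the function with
   extern makes this translation unit emit the external definition, so calls
   that are not inlined (or uses of the function's address) can link. */
extern inline unsigned int sketch_low_byte(unsigned int x);
```

Without the extern declaration, taking the address of sketch_low_byte or compiling without optimization can leave the linker looking for a symbol that no translation unit provides.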
All of the files I made for this project are in the public domain.
I welcome feedback, even on simple things like coding style or mingw-w64 conventions. I hope you guys will find this useful when porting MSVC projects over to mingw-w64, and I know I will. Thanks!
--David Grayson