## fast_float number parsing library: 4x faster than strtod

The fast_float library provides fast header-only implementations for the C++ from_chars
functions for `float` and `double` types. These functions convert ASCII strings representing
decimal values (e.g., `1.3e10`) into binary types. We provide exact rounding (including
round to even). In our experience, these `fast_float` functions are many times faster than comparable number-parsing functions from existing C++ standard libraries.

Specifically, `fast_float` provides the following two functions with a C++17-like syntax (the library itself only requires C++11):

```C++
from_chars_result from_chars(const char* first, const char* last, float& value, ...);
from_chars_result from_chars(const char* first, const char* last, double& value, ...);
```

The return type (`from_chars_result`) is defined as the struct:
```C++
struct from_chars_result {
    const char* ptr;
    std::errc ec;
};
```

It parses the character sequence [first,last) for a number. It parses floating-point numbers expecting
a locale-independent format equivalent to the C++17 from_chars function.
The resulting floating-point value is the closest floating-point value (using either float or double),
using the "round to even" convention for values that would otherwise fall right in-between two values.
That is, we provide exact parsing according to the IEEE standard.

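As a small illustration of the round-to-even rule, the sketch below parses two decimal strings that sit exactly halfway between adjacent `float` values. It uses the standard `std::strtof` as a stand-in (an assumption made so that the snippet builds without this library, and one that presumes a correctly rounding C library such as glibc); `fast_float::from_chars` is described above as rounding the same way. The helper name `parse_float_nearest` is hypothetical.

```C++
#include <cstdlib>

// Hypothetical helper: parse a null-terminated decimal string to float
// with the C library (a stand-in for fast_float::from_chars).
float parse_float_nearest(const char* text) {
    return std::strtof(text, nullptr);
}

// Above 2^24 = 16777216, adjacent float values are 2 apart, so odd
// integers fall exactly between two floats:
//   "16777217" lies between 16777216 and 16777218
//   "16777219" lies between 16777218 and 16777220
// Round to even picks the neighbor with an even mantissa:
// 16777216 in the first case and 16777220 in the second.
```

Note that the tie is broken downward in the first case and upward in the second, which is what distinguishes round to even from always rounding up or always rounding down.
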

Given a successful parse, the pointer (`ptr`) in the returned value is set to point right after the
parsed number, and the `value` referenced is set to the parsed value. In case of error, the returned
`ec` contains a representative error, otherwise the default (`std::errc()`) value is stored.

The implementation does not throw and does not allocate memory (e.g., with `new` or `malloc`).

It will parse infinity and nan values.

Example:

```C++
#include "fast_float/fast_float.h"
#include <cstdlib>      // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
#include <string>
#include <system_error> // std::errc

int main() {
    const std::string input =  "3.1416 xyz ";
    double result;
    // Parsing stops at the first character that cannot be part of the number (the space).
    auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
    if(answer.ec != std::errc()) { std::cerr << "parsing failure\n"; return EXIT_FAILURE; }
    std::cout << "parsed the number " << result << std::endl;
    return EXIT_SUCCESS;
}
```

Like the C++17 standard, the `fast_float::from_chars` functions take an optional last argument of
the type `fast_float::chars_format`. It is a bitset value: we check whether
`fmt & fast_float::chars_format::fixed` and `fmt & fast_float::chars_format::scientific` are set
to determine whether we allow the fixed point and scientific notation respectively.
The default is `fast_float::chars_format::general` which allows both `fixed` and `scientific`.

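To make the bitset test concrete, here is a minimal self-contained sketch. The enum `sketch_format` and its numeric values are assumptions chosen for illustration (they are not taken from the library headers); only the pattern of testing flags with `&` mirrors the description above.

```C++
#include <cstdint>

// Hypothetical flag enum; the bit values are illustrative only.
enum sketch_format : std::uint8_t {
    sketch_scientific = 1 << 0,
    sketch_fixed      = 1 << 1,
    sketch_general    = sketch_fixed | sketch_scientific
};

// True when the format permits an exponent part such as "e10".
bool allows_scientific(sketch_format fmt) {
    return (fmt & sketch_scientific) != 0;
}

// True when the format permits plain decimal notation such as "3.14".
bool allows_fixed(sketch_format fmt) {
    return (fmt & sketch_fixed) != 0;
}
```

With this layout, `sketch_general` passes both checks, while `sketch_fixed` alone passes only `allows_fixed`.
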

The library seeks to follow the C++17 (see [20.19.3](http://eel.is/c++draft/charconv.from.chars).(7.1)) specification.
* The `from_chars` function does not skip leading white-space characters.
* [A leading `+` sign](https://en.cppreference.com/w/cpp/utility/from_chars) is forbidden.
* It is generally impossible to represent a decimal value exactly as a binary floating-point number (`float` and `double` types). We seek the nearest value. We round to an even mantissa when we are in-between two binary floating-point numbers.

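The first two rules can be checked with a short sketch. It calls the standard C++17 `std::from_chars` (an assumption: your standard library must ship the floating-point overloads, e.g. GCC 11 or later), since this README describes `fast_float::from_chars` as following the same specification; swapping in `fast_float::from_chars` should behave alike. The helper name `parses_entirely` is hypothetical.

```C++
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: true only if the whole string is one valid number.
bool parses_entirely(const std::string& text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto answer = std::from_chars(first, last, value);
    // Success requires no error and no unconsumed characters.
    return answer.ec == std::errc() && answer.ptr == last;
}
```

Under these rules, `parses_entirely("3.1416")` holds, while `" 3.1416"` (leading space) and `"+3.1416"` (leading plus sign) are rejected. Callers that accept such input need to trim it themselves before parsing.
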

Furthermore, we have the following restrictions:
* We only support `float` and `double` types at this time.
* We only support the decimal format: we do not support hexadecimal strings.
* For values that are either very large or very small (e.g., `1e9999`), we represent them using the infinity or negative infinity value.

We support Visual Studio, macOS, Linux, FreeBSD. We support big and little endian. We support 32-bit and 64-bit systems.

We assume that the rounding mode is set to nearest (`std::fegetround() == FE_TONEAREST`).

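If your program changes the floating-point environment elsewhere, you may want to verify this assumption before parsing. A minimal sketch using only the standard `<cfenv>` header follows; the function name `rounding_is_to_nearest` is hypothetical and not part of the library.

```C++
#include <cfenv>

// Hypothetical guard: true when the current rounding mode matches the
// mode that the library assumes.
bool rounding_is_to_nearest() {
    return std::fegetround() == FE_TONEAREST;
}
```
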

## Using commas as decimal separator

The C++ standard stipulates that `from_chars` has to be locale-independent. In
particular, the decimal separator has to be the period (`.`). However,
some users still want to use the `fast_float` library in a locale-dependent
manner. Using a separate function called `from_chars_advanced`, we allow the users
to pass a `parse_options` instance which contains a custom decimal separator (e.g.,
the comma). You may use it as follows.

```C++
#include "fast_float/fast_float.h"
#include <cstdlib>      // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>
#include <string>
#include <system_error> // std::errc

int main() {
    const std::string input =  "3,1416 xyz ";
    double result;
    // Accept both fixed and scientific notation, with ',' as the decimal separator.
    fast_float::parse_options options{fast_float::chars_format::general, ','};
    auto answer = fast_float::from_chars_advanced(input.data(), input.data()+input.size(), result, options);
    if((answer.ec != std::errc()) || ((result != 3.1416))) { std::cerr << "parsing failure\n"; return EXIT_FAILURE; }
    std::cout << "parsed the number " << result << std::endl;
    return EXIT_SUCCESS;
}
```

You can parse delimited numbers:
```C++
  const std::string input =   "234532.3426362,7869234.9823,324562.645";
  double result;
  auto answer = fast_float::from_chars(input.data(), input.data()+input.size(), result);
  if(answer.ec != std::errc()) {
    // check error
  }
  // we have result == 234532.3426362.
  if(answer.ptr[0] != ',') {
    // unexpected delimiter
  }
  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
  if(answer.ec != std::errc()) {
    // check error
  }
  // we have result == 7869234.9823.
  if(answer.ptr[0] != ',') {
    // unexpected delimiter
  }
  answer = fast_float::from_chars(answer.ptr + 1, input.data()+input.size(), result);
  if(answer.ec != std::errc()) {
    // check error
  }
  // we have result == 324562.645.
```

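The same pattern generalizes to a loop. The sketch below is an illustration rather than library code: the function name `parse_comma_separated` is hypothetical, and it calls the standard C++17 `std::from_chars` (assuming a standard library with floating-point support, e.g. GCC 11 or later) so that it builds on its own. Because `fast_float::from_chars` takes the same leading arguments, it can be substituted for that call.

```C++
#include <charconv>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: parse "a,b,c" into doubles.
// Returns false on a malformed number or an unexpected delimiter.
bool parse_comma_separated(const std::string& input, std::vector<double>& out) {
    const char* p = input.data();
    const char* end = p + input.size();
    while (p != end) {
        double value = 0.0;
        auto answer = std::from_chars(p, end, value);
        if (answer.ec != std::errc()) { return false; }  // not a number
        out.push_back(value);
        p = answer.ptr;
        if (p == end) { break; }         // consumed the last number
        if (*p != ',') { return false; } // unexpected delimiter
        ++p;                             // skip the comma
        if (p == end) { return false; }  // trailing comma, nothing follows
    }
    return true;
}
```

Each iteration resumes from the returned `ptr`, so the input is scanned once and no temporary substrings are created.
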

## Relation With Other Work

The fast_float library is part of:

- GCC (as of version 12): the `from_chars` function in GCC relies on fast_float.
- [WebKit](https://github.com/WebKit/WebKit), the engine behind Safari (Apple's web browser)

The fast_float algorithm is part of the [LLVM standard libraries](https://github.com/llvm/llvm-project/commit/87c016078ad72c46505461e4ff8bfa04819fe7ba).

There is a [derived implementation that is part of AdaCore](https://github.com/AdaCore/VSS).

The fast_float library provides performance similar to that of the [fast_double_parser](https://github.com/lemire/fast_double_parser) library, but it uses an updated algorithm reworked from the ground up and offers an API more in line with the expectations of C++ programmers. The fast_double_parser library is part of the [Microsoft LightGBM machine-learning framework](https://github.com/microsoft/LightGBM).

## Reference

- Daniel Lemire, [Number Parsing at a Gigabyte per Second](https://arxiv.org/abs/2101.11408), Software: Practice and Experience 51 (8), 2021.

## Other programming languages

- [There is an R binding](https://github.com/eddelbuettel/rcppfastfloat) called `rcppfastfloat`.
- [There is a Rust port of the fast_float library](https://github.com/aldanor/fast-float-rust/) called `fast-float-rust`.
- [There is a Java port of the fast_float library](https://github.com/wrandelshofer/FastDoubleParser) called `FastDoubleParser`. It is used in important systems such as [Jackson](https://github.com/FasterXML/jackson-core).
- [There is a C# port of the fast_float library](https://github.com/CarlVerret/csFastFloat) called `csFastFloat`.

## Users

The fast_float library is used by [Apache Arrow](https://github.com/apache/arrow/pull/8494), where it made number parsing two to three times faster. It is also used by [Yandex ClickHouse](https://github.com/ClickHouse/ClickHouse) and by [Google Jsonnet](https://github.com/google/jsonnet).


## How fast is it?

It can parse random floating-point numbers at a speed of 1 GB/s on some systems. We find that it is often twice as fast as the best available competitor, and many times faster than many standard-library implementations.

<img src="http://lemire.me/blog/wp-content/uploads/2020/11/fastfloat_speed.png" width="400">

```
$ ./build/benchmarks/benchmark
# parsing random integers in the range [0,1)
volume = 2.09808 MB
netlib                                  :   271.18 MB/s (+/- 1.2 %)    12.93 Mfloat/s
doubleconversion                        :   225.35 MB/s (+/- 1.2 %)    10.74 Mfloat/s
strtod                                  :   190.94 MB/s (+/- 1.6 %)     9.10 Mfloat/s
abseil                                  :   430.45 MB/s (+/- 2.2 %)    20.52 Mfloat/s
fastfloat                               :  1042.38 MB/s (+/- 9.9 %)    49.68 Mfloat/s
```

See https://github.com/lemire/simple_fastfloat_benchmark for our benchmarking code.

## Video

[Watch the video on YouTube](http://www.youtube.com/watch?v=AVXgvlMeIm4)<br />


## Using as a CMake dependency

This library is header-only by design. The CMake file provides the `fast_float` target
which is merely a pointer to the `include` directory.

If you drop the `fast_float` repository in your CMake project, you should be able to use
it in this manner:

```cmake
add_subdirectory(fast_float)
target_link_libraries(myprogram PUBLIC fast_float)
```

Or you may want to retrieve the dependency automatically if you have a sufficiently recent version of CMake (the `FetchContent` module requires 3.11 at least, and `FetchContent_MakeAvailable` as used here requires 3.14):

```cmake
include(FetchContent)

FetchContent_Declare(
  fast_float
  GIT_REPOSITORY https://github.com/lemire/fast_float.git
  GIT_TAG tags/v1.1.2
  GIT_SHALLOW TRUE)

FetchContent_MakeAvailable(fast_float)
target_link_libraries(myprogram PUBLIC fast_float)
```

You should change the `GIT_TAG` line so that you recover the version you wish to use.


## Using as single header

The script `script/amalgamate.py` may be used to generate a single header
version of the library if so desired.
Just run the script from the root directory of this repository.
You can customize the license type and output file if desired as described in
the command line help.

You may directly download automatically generated single-header files:

https://github.com/fastfloat/fast_float/releases/download/v3.4.0/fast_float.h

## Credit

Though this work is inspired by many different people, this work benefited especially from exchanges with
Michael Eisel, who motivated the original research with his key insights, and with Nigel Tao who provided
invaluable feedback. Rémy Oudompheng first implemented a fast path we use in the case of long digits.

The library includes code adapted from Google Wuffs (written by Nigel Tao) which was originally published
under the Apache 2.0 license.

## License

<sup>
Licensed under either of <a href="LICENSE-APACHE">Apache License, Version
2.0</a> or <a href="LICENSE-MIT">MIT license</a> at your option.
</sup>

<br>

<sub>
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in this repository by you, as defined in the Apache-2.0 license,
shall be dual licensed as above, without any additional terms or conditions.
</sub>