user0/rust

Author	SHA1	Message	Date
Arpad Borsos	488598c183	Add a lower bound check to `unicode-table-generator` output This adds a dedicated check for the lower bound (if it is outside of ASCII range) to the output of the `unicode-table-generator` tool. This generalizes the ASCII-only fast-path, but only for the `Grapheme_Extend` property for now, as that is the only one with a lower bound outside of ASCII.	2024-04-20 10:16:45 +02:00
Marcondiro	e9870b5df3	Bump Unicode printables to version 15.1, align to unicode_data	2024-03-28 11:21:52 +01:00
Marcondiro	01fa7209d5	Bump Unicode to version 15.1.0, regenerate tables	2024-02-09 17:35:46 +01:00
Trevor Gross	22d00dcd47	Apply changes to fix python linting errors	2023-06-16 20:56:01 -04:00
Martin Gammelsæter	54f55efb9a	Use hex literal for INDEX_MASK	2023-03-21 09:59:47 +01:00
Martin Gammelsæter	355e1dda1d	Improve case mapping encoding scheme The indices are encoded as `u32`s in the range of invalid `char`s, so that we know that if any mapping fails to parse as a `char` we should use the value for lookup in the multi-table. This avoids the second binary search in cases where a multi-`char` mapping is needed. Idea from @nikic	2023-03-16 21:42:15 +01:00
Martin Gammelsæter	f9bd884385	Split unicode case LUTs in single and multi variants The majority of char case replacements are single char replacements, so storing them as [char; 3] wastes a lot of space. This commit splits the replacement tables for both `to_lower` and `to_upper` into two separate tables, one with single-character mappings and one with multi-character mappings. This reduces the binary size for programs using all of these tables with roughly 24K bytes.	2023-03-16 12:34:04 +01:00
Martin Gammelsæter	8a4eb9e3a8	Skip serializing ascii chars in case LUTs Since ascii chars are already handled by a special case in the `to_lower` and `to_upper` functions, there's no need to waste space on them in the LUTs.	2023-03-15 17:27:23 +01:00
jonathanCogan	db47071df2	Replace libstd, libcore, liballoc in line comments.	2022-12-30 14:00:42 +01:00
Thom Chiovoloni	ac55092a14	Bump Unicode to version 15.0.0, regenerate tables	2022-09-14 13:21:19 -07:00
Sage Mitchell	2b328ea5ee	Address feedback from PR #101401	2022-09-04 08:07:53 -07:00
Sage Mitchell	4a3e169da7	Make `char::is_lowercase` and `char::is_uppercase` const Implements #101400.	2022-09-04 08:07:53 -07:00
Bruce A. MacNaughton	5d048eb69d	add #inline	2022-07-20 16:13:54 -07:00
Bruce A. MacNaughton	e5d4de3912	generated code	2022-07-19 18:03:33 -07:00
Nilstrieb	3358a41acb	Add unicode fast path to `is_printable` Before, it would enter the full expensive check even for normal ascii characters. Now, it skips the check for the ascii characters in `32..127`. This range was checked manually from the current behavior.	2022-05-31 10:51:35 +02:00
Josh Stone	459a7e340c	Regenerate tables for Unicode 14.0.0	2021-10-06 17:49:33 -07:00
Smitty	bdfcb88e8b	Use HTTPS links where possible	2021-06-23 16:26:46 -04:00
Miccah Castorina	e48c68479e	Add a check for ASCII characters in to_upper and to_lower This extra check has better performance. See discussion here: https://internals.rust-lang.org/t/to-upper-speed/13896	2021-02-26 11:39:36 -06:00
Aleksey Kladov	88da5682c3	Privatize some of libcore unicode_internals My understanding is that these APIs are perma-unstable, so it doesn't make sense to pollute docs & IDE completion[1] with them. [1]: https://github.com/rust-analyzer/rust-analyzer/issues/6738	2020-12-07 16:16:42 +03:00
mark	2c31b45ae8	mv std libs to library/	2020-07-27 19:51:13 -05:00

20 commits
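
Several of the commits above (8a4eb9e3a8, f9bd884385, 355e1dda1d, 54f55efb9a, and the ASCII check from e48c68479e) describe one case-mapping lookup scheme: ASCII is handled before any table access, single-character mappings live in a compact table, and multi-character mappings are reached through an index stored as a `u32` that is not a valid `char`. The following is a minimal sketch of that idea, not the generated code itself. The table contents are a hand-picked toy subset, and the names `UPPERCASE_TABLE`, `UPPERCASE_TABLE_MULTI`, and the value of `INDEX_MASK` are assumptions chosen for illustration.

```rust
/// Assumed flag bit: any value with this bit set exceeds U+10FFFF,
/// so `char::from_u32` rejects it and it can safely carry an index instead.
const INDEX_MASK: u32 = 0x40_0000;

/// Toy single-mapping table, sorted by key so it can be binary searched.
/// ASCII entries are omitted because the fast path handles them.
/// The second field is either a valid `char` value or `INDEX_MASK | index`.
static UPPERCASE_TABLE: &[(char, u32)] = &[
    ('\u{B5}', 0x39C),            // MICRO SIGN -> GREEK CAPITAL LETTER MU
    ('\u{DF}', INDEX_MASK),       // SHARP S -> entry 0 of the multi table
    ('\u{E9}', 0xC9),             // e with acute -> E with acute
    ('\u{FB01}', INDEX_MASK | 1), // LIGATURE FI -> entry 1 of the multi table
];

/// Toy multi-character table, padded with '\0'.
static UPPERCASE_TABLE_MULTI: &[[char; 3]] = &[
    ['S', 'S', '\0'],
    ['F', 'I', '\0'],
];

/// Returns the uppercase mapping of `c`, padded with '\0'.
fn to_upper(c: char) -> [char; 3] {
    // ASCII fast path: no table lookup at all.
    if c.is_ascii() {
        return [c.to_ascii_uppercase(), '\0', '\0'];
    }
    match UPPERCASE_TABLE.binary_search_by(|&(key, _)| key.cmp(&c)) {
        Ok(i) => {
            let u = UPPERCASE_TABLE[i].1;
            match char::from_u32(u) {
                // The value is a real `char`: a single-character mapping.
                Some(single) => [single, '\0', '\0'],
                // Not a valid `char`: strip the flag bit and use the rest
                // as an index, so no second binary search is needed.
                None => UPPERCASE_TABLE_MULTI[(u & (INDEX_MASK - 1)) as usize],
            }
        }
        // No entry: the character maps to itself.
        Err(_) => [c, '\0', '\0'],
    }
}
```

In this sketch, `to_upper('a')` never touches the tables, `to_upper('\u{E9}')` resolves from the single table alone, and `to_upper('\u{DF}')` takes one binary search followed by a direct index into the multi table.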