pub struct Utf8Error { /* fields omitted */ }
Errors which can occur when attempting to interpret a sequence of u8
as a string.
For example, the from_utf8 family of functions and methods for both
Strings and &strs makes use of this error.
This error type’s methods can be used to create functionality
similar to String::from_utf8_lossy
without allocating heap memory:
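fn from_utf8_lossy<F>(mut input: &[u8], mut push: F) where F: FnMut(&str) {
    loop {
        match ::std::str::from_utf8(input) {
            // The rest of the input is valid: emit it and stop.
            Ok(valid) => {
                push(valid);
                break
            }
            Err(error) => {
                let (valid, after_valid) = input.split_at(error.valid_up_to());
                // SAFETY: valid_up_to() guarantees that this prefix is valid UTF-8.
                unsafe {
                    push(::std::str::from_utf8_unchecked(valid))
                }
                push("\u{FFFD}");
                if let Some(invalid_sequence_length) = error.error_len() {
                    // Skip the invalid sequence and keep decoding after it.
                    input = &after_valid[invalid_sequence_length..]
                } else {
                    // The input ended in the middle of a sequence.
                    break
                }
            }
        }
    }
}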
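As a rough usage sketch, the function above can be driven by any closure. The `lossy_to_string` helper below is hypothetical (it is not part of the standard library) and collects the pieces into a `String` only to make the output easy to inspect; the second call in `main` shows a closure that does not allocate at all.

```rust
use std::str;

// Same function as in the text above, repeated so the sketch compiles on its own.
fn from_utf8_lossy<F>(mut input: &[u8], mut push: F) where F: FnMut(&str) {
    loop {
        match str::from_utf8(input) {
            Ok(valid) => {
                push(valid);
                break;
            }
            Err(error) => {
                let (valid, after_valid) = input.split_at(error.valid_up_to());
                // SAFETY: `valid_up_to()` guarantees this prefix is valid UTF-8.
                unsafe { push(str::from_utf8_unchecked(valid)) }
                push("\u{FFFD}");
                if let Some(invalid_sequence_length) = error.error_len() {
                    input = &after_valid[invalid_sequence_length..];
                } else {
                    break;
                }
            }
        }
    }
}

// Hypothetical helper: collects the pushed pieces into a `String`.
fn lossy_to_string(input: &[u8]) -> String {
    let mut out = String::new();
    from_utf8_lossy(input, |piece| out.push_str(piece));
    out
}

fn main() {
    // An invalid three-byte sequence in the middle of the input.
    let bytes = b"Hello \xF0\x90\x80World";
    assert_eq!(lossy_to_string(bytes), "Hello \u{FFFD}World");

    // Allocation-free use: only measure the length of the lossy output.
    let mut len = 0;
    from_utf8_lossy(bytes, |piece| len += piece.len());
    assert_eq!(len, 14);
}
```

Whether any heap memory is used is decided entirely by the closure: the decoding loop itself only hands out borrowed `&str` slices.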
Returns the index in the given string up to which valid UTF-8 was
verified.
It is the maximum index such that from_utf8(&input[..index])
would return Ok(_).
Basic usage:
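use std::str;
// some invalid bytes, in a vector
let sparkle_heart = vec![0, 159, 146, 150];
// std::str::from_utf8 returns a Utf8Error
let error = str::from_utf8(&sparkle_heart).unwrap_err();
// the second byte is invalid here
assert_eq!(1, error.valid_up_to());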
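One way to use this index, sketched under the assumption that only the valid prefix is of interest, is to slice the input at `valid_up_to()`. The `valid_prefix` helper is hypothetical and exists only for illustration.

```rust
use std::str;

// Hypothetical helper: returns the longest prefix of `input` that is valid UTF-8.
fn valid_prefix(input: &[u8]) -> &str {
    match str::from_utf8(input) {
        Ok(all) => all,
        // Re-validating the prefix keeps the sketch free of `unsafe`.
        Err(error) => str::from_utf8(&input[..error.valid_up_to()])
            .expect("prefix up to valid_up_to() is valid UTF-8"),
    }
}

fn main() {
    // Fully valid input is returned unchanged.
    assert_eq!(valid_prefix(b"caf\xC3\xA9"), "café");
    // 0xFF can never appear in UTF-8, so the prefix stops before it.
    assert_eq!(valid_prefix(b"ab\xFFcd"), "ab");
    // The bytes from the example above: only the leading NUL byte is valid.
    assert_eq!(valid_prefix(&[0, 159, 146, 150]), "\0");
}
```

The second validation pass could be replaced with `from_utf8_unchecked` when the extra scan matters, at the cost of an `unsafe` block.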
Provides more information about the failure:
- None: the end of the input was reached unexpectedly.
  self.valid_up_to() is 1 to 3 bytes from the end of the input.
  If a byte stream (such as a file or a network socket) is being decoded incrementally,
  this could be a valid char whose UTF-8 byte sequence is spanning multiple chunks.
- Some(len): an unexpected byte was encountered.
  The length provided is that of the invalid byte sequence
  that starts at the index given by valid_up_to().
  Decoding should resume after that sequence
  (after inserting a U+FFFD REPLACEMENT CHARACTER) in case of lossy decoding.
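The two cases above can be combined into an incremental lossy decoder. The following is a minimal sketch, not a definitive implementation: `decode_chunk` and its `pending` buffer are hypothetical names introduced here, and the buffering strategy (a `Vec<u8>` holding the unfinished tail) is just one possible choice.

```rust
use std::str;

// Hypothetical helper: lossily decodes `chunk` into `out`, carrying an
// incomplete trailing sequence over to the next call in `pending`.
fn decode_chunk(pending: &mut Vec<u8>, chunk: &[u8], out: &mut String) {
    pending.extend_from_slice(chunk);
    let mut consumed = 0;
    loop {
        match str::from_utf8(&pending[consumed..]) {
            Ok(valid) => {
                out.push_str(valid);
                consumed = pending.len();
                break;
            }
            Err(error) => {
                let valid_end = consumed + error.valid_up_to();
                // Re-validating avoids `unsafe`; this slice is known to be valid.
                out.push_str(str::from_utf8(&pending[consumed..valid_end]).unwrap());
                match error.error_len() {
                    // Invalid sequence: replace it and resume right after it.
                    Some(len) => {
                        out.push('\u{FFFD}');
                        consumed = valid_end + len;
                    }
                    // Incomplete sequence at the end: wait for more input.
                    None => {
                        consumed = valid_end;
                        break;
                    }
                }
            }
        }
    }
    // Keep only the bytes that have not been decoded yet.
    pending.drain(..consumed);
}

fn main() {
    let mut pending = Vec::new();
    let mut out = String::new();

    // "é" is encoded as 0xC3 0xA9; here it is split across two chunks.
    decode_chunk(&mut pending, b"caf\xC3", &mut out);
    assert_eq!(out, "caf");
    assert_eq!(pending, vec![0xC3u8]);

    // The second chunk completes "é" and also contains an invalid byte.
    decode_chunk(&mut pending, b"\xA9 \xFF!", &mut out);
    assert_eq!(out, "café \u{FFFD}!");
    assert!(pending.is_empty());
}
```

When the stream ends while `pending` is still non-empty, the leftover bytes are a truncated sequence; a caller doing lossy decoding would then typically emit one more U+FFFD.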
Formats the value using the given formatter.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=.
Formats the value using the given formatter.
Performs copy-assignment from source.