KEncodingProber Class
Provides encoding detection (probing) capabilities.
Header: | #include <KEncodingProber> |
CMake: | find_package(KF6 REQUIRED COMPONENTS Codecs) target_link_libraries(mytarget PRIVATE KF6::Codecs) |
Public Types
enum | ProberState { FoundIt, NotMe, Probing } |
enum | ProberType { None, Universal, Arabic, Baltic, CentralEuropean, …, WesternEuropean } |
Public Functions
KEncodingProber(KEncodingProber::ProberType proberType = Universal) | |
float | confidence() const |
(since 4.2.2) QByteArray | encoding() const |
KEncodingProber::ProberState | feed(QByteArrayView data) |
void | reset() |
void | setProberType(KEncodingProber::ProberType proberType) |
KEncodingProber::ProberState | state() const |
Static Public Members
QString | nameForProberType(KEncodingProber::ProberType proberType) |
KEncodingProber::ProberType | proberTypeForName(const QString &lang) |
Detailed Description
Probes the encoding of raw data only. If the prober cannot determine the encoding with certainty, encoding() returns the most likely encoding it has guessed so far.
A Unicode probe is always performed, regardless of the ProberType.
Feed data to the prober repeatedly with feed() until the ProberState changes to FoundIt or NotMe, or until confidence() returns a value you find acceptable.
Intended lifetime of the object: one instance per ProberType.
Typical use:
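    QByteArray data, moredata;
    ...
    // Probe only the encodings associated with Simplified Chinese.
    KEncodingProber prober(KEncodingProber::ChineseSimplified);
    prober.feed(data);
    prober.feed(moredata);
    QByteArray encoding;
    // Accept the guess only if the prober is reasonably confident.
    if (prober.confidence() > 0.6)
        encoding = prober.encoding();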
At least 256 characters are needed to change the ProberState from Probing to FoundIt. If you have fewer characters to probe, decide for yourself, based on confidence(), whether to accept the encoding guessed so far.
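The feeding strategy described above can be sketched as a small helper. This is only an illustration under stated assumptions: detectEncoding and StubProber are hypothetical names, not part of KCodecs. StubProber merely imitates the member names of KEncodingProber so that the sketch compiles without the library, and its 256-byte rule and confidence values are made up. Treating NotMe as a rejection is a choice of this sketch, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in with the same member names as KEncodingProber, so the
// sketch compiles without KCodecs. It reports FoundIt once 256 bytes were fed;
// the confidence values are invented for illustration.
struct StubProber {
    enum ProberState { FoundIt, NotMe, Probing };
    std::size_t seen = 0;
    ProberState feed(const std::string &data) {
        seen += data.size();
        return state();
    }
    ProberState state() const { return seen >= 256 ? FoundIt : Probing; }
    float confidence() const { return seen >= 256 ? 0.99f : 0.5f; }
    std::string encoding() const { return "UTF-8"; }
};

// Feeds chunks until the prober leaves the Probing state. The guess is
// accepted if the prober reports FoundIt, or if it is still Probing and the
// confidence reaches minConfidence. Otherwise an empty value is returned.
template <typename Prober, typename Chunk>
auto detectEncoding(Prober &prober, const std::vector<Chunk> &chunks, float minConfidence)
    -> decltype(prober.encoding())
{
    assert(minConfidence >= 0.0f && minConfidence <= 1.0f);
    for (const Chunk &chunk : chunks) {
        if (prober.feed(chunk) != Prober::Probing)
            break; // FoundIt or NotMe: stop feeding
    }
    if (prober.state() == Prober::FoundIt)
        return prober.encoding();
    if (prober.state() == Prober::Probing && prober.confidence() >= minConfidence)
        return prober.encoding();
    return {}; // NotMe, or a guess below the caller's threshold
}
```

The helper relies only on feed(), state(), confidence() and encoding(), so with KCodecs available it should also accept a KEncodingProber together with a std::vector<QByteArray> in place of the stub.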
Member Type Documentation
enum KEncodingProber::ProberState
Constant | Value | Description |
---|---|---|
KEncodingProber::FoundIt | 0 | The encoding has been identified with certainty |
KEncodingProber::NotMe | 1 | The data is definitely not in any of the encodings supported by the current ProberType |
KEncodingProber::Probing | 2 | More data is needed to make a decision |
enum KEncodingProber::ProberType
Constant | Value |
---|---|
KEncodingProber::None | 0 |
KEncodingProber::Universal | 1 |
KEncodingProber::Arabic | 2 |
KEncodingProber::Baltic | 3 |
KEncodingProber::CentralEuropean | 4 |
KEncodingProber::ChineseSimplified | 5 |
KEncodingProber::ChineseTraditional | 6 |
KEncodingProber::Cyrillic | 7 |
KEncodingProber::Greek | 8 |
KEncodingProber::Hebrew | 9 |
KEncodingProber::Japanese | 10 |
KEncodingProber::Korean | 11 |
KEncodingProber::NorthernSaami | 12 |
KEncodingProber::Other | 13 |
KEncodingProber::SouthEasternEurope | 14 |
KEncodingProber::Thai | 15 |
KEncodingProber::Turkish | 16 |
KEncodingProber::Unicode | 17 |
KEncodingProber::WesternEuropean | 18 |
Member Function Documentation
KEncodingProber::KEncodingProber(KEncodingProber::ProberType proberType = Universal)
The default ProberType is Universal (detects all possible encodings).
float KEncodingProber::confidence() const
Returns the confidence (sureness) of the encoding guessed so far, in the range 0.0 to 0.99. The value is not very reliable for single-byte encodings.
[since 4.2.2]
QByteArray KEncodingProber::encoding() const
Returns a QByteArray with the name of the best encoding guessed so far.
This function was introduced in 4.2.2.
KEncodingProber::ProberState KEncodingProber::feed(QByteArrayView data)
The main method of the class: feeds data to the prober.
Returns the ProberState after probing the fed data.
[static]
QString KEncodingProber::nameForProberType(KEncodingProber::ProberType proberType)
Maps a ProberType to its language string.
proberType is the prober type to map.
Returns the language string.
[static]
KEncodingProber::ProberType KEncodingProber::proberTypeForName(const QString &lang)
Returns the ProberType for lang (e.g. proberTypeForName("Chinese Simplified") will return KEncodingProber::ChineseSimplified).
void KEncodingProber::reset()
Resets the prober's internal state and data.
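One use of reset() is to probe several unrelated buffers with a single prober instance. The following is a minimal sketch under stated assumptions: detectEach and ToyProber are hypothetical names, and ToyProber is a toy stand-in that only imitates the member names used here so the code compiles without KCodecs. Its "detection" rule is invented and says nothing about how KEncodingProber actually works.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in (hypothetical): remembers whether any non-ASCII byte was fed.
// It only imitates the member names used below.
struct ToyProber {
    bool sawHighByte = false;
    void reset() { sawHighByte = false; }
    void feed(const std::string &data) {
        for (unsigned char c : data) {
            if (c >= 0x80)
                sawHighByte = true;
        }
    }
    std::string encoding() const { return sawHighByte ? "UTF-8" : "US-ASCII"; }
};

// Probes each buffer independently with one prober instance, calling reset()
// first so that data from the previous buffer cannot influence the result.
template <typename Prober, typename Buffer>
auto detectEach(Prober &prober, const std::vector<Buffer> &buffers)
    -> std::vector<decltype(prober.encoding())>
{
    std::vector<decltype(prober.encoding())> names;
    for (const Buffer &buffer : buffers) {
        prober.reset();
        prober.feed(buffer);
        names.push_back(prober.encoding());
    }
    assert(names.size() == buffers.size()); // one result per input buffer
    return names;
}
```

Without the reset() call, the toy prober would carry the non-ASCII flag from the first buffer into the second, which is the kind of cross-contamination the call is meant to prevent.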
void KEncodingProber::setProberType(KEncodingProber::ProberType proberType)
Changes the current prober's ProberType and resets the prober.
proberType is the new type.
KEncodingProber::ProberState KEncodingProber::state() const
Returns the prober's current ProberState.