boostorg/tokenizer

Boost.Tokenizer is part of the Boost C++ Libraries. It provides a flexible and easy-to-use way to break a string or other character sequence into a series of tokens.

License

Distributed under the Boost Software License, Version 1.0.

Properties

  • C++03
  • Header-Only

Overview

Break up a phrase into words:

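#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main(){
    std::string s = "This is,  a test";
    // The default tokenizer splits on whitespace and punctuation
    // and does not return the delimiters as tokens.
    typedef boost::tokenizer<> Tok;
    Tok tok(s);
    // Prints "This", "is", "a" and "test", one per line.
    for (Tok::iterator beg = tok.begin(); beg != tok.end(); ++beg){
        std::cout << *beg << "\n";
    }
}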

Using a range-based for loop (C++11 or later)

#include <iostream>
#include <boost/tokenizer.hpp>
#include <string>

int main(){
    std::string s = "This is,  a test";
    boost::tokenizer<> tok(s);
    // Bind each token by const reference to avoid copying the string.
    for (const auto& token : tok) {
        std::cout << token << "\n";
    }
}

Related Material

Chapter 10 (Boost.Tokenizer) at theboostcpplibraries.com contains several examples, including one that uses escaped_list_separator.

Acknowledgements

From the author:

I wish to thank the members of the boost mailing list, whose comments, compliments, and criticisms during both the development and formal review helped make the Tokenizer library what it is. I especially wish to thank Aleksey Gurtovoy for the idea of using a pair of iterators to specify the input, instead of a string. I also wish to thank Jeremy Siek for his idea of providing a container interface for the token iterators and for simplifying the template parameters for the TokenizerFunctions. He and Daryle Walker also emphasized the need to separate interface and implementation. Gary Powell sparked the idea of using the isspace and ispunct as the defaults for char_delimiters_separator. Jeff Garland provided ideas on how to change the order of the template parameters in order to make tokenizer easier to declare. Thanks to Douglas Gregor who served as review manager and provided many insights both on the boost list and in e-mail on how to polish up the implementation and presentation of Tokenizer. Finally, thanks to Beman Dawes who integrated the final version into the boost distribution.

Directories

Name     Purpose
example  examples
include  header
test     unit tests

More information

  • Ask questions
  • Report bugs: Be sure to mention Boost version, platform and compiler you're using. A small compilable code sample to reproduce the problem is always good as well.
  • Submit your patches as pull requests against the develop branch. Note that by submitting patches you agree to license your modifications under the Boost Software License, Version 1.0.
  • Discussions about the library are held on the Boost developers mailing list. Be sure to read the discussion policy before posting and add the [tokenizer] tag at the beginning of the subject line.