CodeRay is a Ruby library for syntax highlighting.
I try to make CodeRay easy to use and intuitive, but at the same time fully featured, complete, fast and efficient.
See README.
It consists mainly of:
* the main engine: CodeRay (Scanners::Scanner, Tokens/TokenStream, Encoders::Encoder), PluginHost
* the scanners in CodeRay::Scanners
* the encoders in CodeRay::Encoders
See CodeRay, Encoders, Scanners, Tokens.
Remember that you need RubyGems to use CodeRay, unless CodeRay is already in your load path. Run Ruby with the -rubygems option if required.
require 'coderay'
print CodeRay.scan('puts "Hello, world!"', :ruby).html

# prints something like this:
# puts <span class="s">"Hello, world!"</span>

require 'coderay'
print CodeRay.scan(File.read('ruby.h'), :c).div
print CodeRay.scan_file('ruby.h').html.div
You can include this div in your page. The CSS styles it uses can be printed with:
% coderay_stylesheet
If you are one of the hasty (or lazy, or extremely curious) people, just run this file:
% ruby -rubygems /path/to/coderay/coderay.rb > example.html
and look at the file it created in your browser.
The CodeRay module provides convenience methods for the engine.
The lang and format arguments select the Scanner and Encoder to use. These are simply lower-case symbols, like :python or :html.
All methods take an optional hash as the last parameter, options, which is sent to the Encoder / Scanner.
Input and language are always sorted in this order: code, lang. (This is in alphabetical order, if you need a mnemonic ;)
You should be able to highlight everything you want just using these methods; so there is no need to dive into CodeRay’s deep class hierarchy.
The examples in the demo directory demonstrate common cases using this interface.
Read this to get a general view of what CodeRay provides.
Scanning means analysing an input string and splitting it up into Tokens. Each Token knows what type it is: string, comment, class name, etc. Each lang (language) has its own Scanner; for example, :ruby code is handled by CodeRay::Scanners::Ruby.
CodeRay.scan | Scan a string in a given language into Tokens. This is the most common method to use. |
CodeRay.scan_file | Scan a file and guess the language using FileType. |
The Tokens object you get from these methods can encode itself; see Tokens.
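To make the idea of scanning concrete, here is a minimal sketch in plain Ruby. It is not CodeRay's implementation: toy_scan, its token kinds and its [text, kind] pair layout are made up for illustration, and it only understands a tiny Ruby-like subset.

```ruby
require 'strscan'

# Toy scanner (hypothetical, not part of CodeRay): splits the input into
# [text, kind] pairs, one pair per token.
def toy_scan(code)
  scanner = StringScanner.new(code)
  tokens = []
  until scanner.eos?
    if (text = scanner.scan(/\s+/))
      tokens << [text, :space]
    elsif (text = scanner.scan(/#.*/))
      tokens << [text, :comment]
    elsif (text = scanner.scan(/"(?:\\.|[^"\\])*"/))
      tokens << [text, :string]
    elsif (text = scanner.scan(/[A-Za-z_]\w*/))
      tokens << [text, :ident]
    else
      # Anything the toy grammar does not know is consumed one character
      # at a time so the loop always makes progress.
      tokens << [scanner.getch, :error]
    end
  end
  tokens
end

toy_scan('puts "Hello, world!" # greet').each do |text, kind|
  puts "#{kind}: #{text.inspect}"
end
```

Every character of the input ends up in exactly one token, so joining the token texts gives back the original string.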
Encoding means compiling Tokens into an output. This can be colored HTML or LaTeX, a textual statistic or just the number of non-whitespace tokens.
Each Encoder provides output in a specific format, so you select Encoders via formats like :html or :statistic.
CodeRay.encode | Scan and encode a string in a given language. |
CodeRay.encode_tokens | Encode the given tokens. |
CodeRay.encode_file | Scan a file, guess the language using FileType and encode it. |
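The encoding step can be sketched the same way. The following is a plain-Ruby illustration under assumptions, not CodeRay's encoder code: toy_html and toy_count are hypothetical helpers, tokens are assumed to be [text, kind] pairs, and the CSS class table is invented apart from the "s" class seen in the HTML example earlier.

```ruby
# Hypothetical mapping from token kind to CSS class.
CSS_CLASS = { string: 's', comment: 'c' }.freeze
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

# Toy HTML encoder: wraps tokens that have a CSS class in a <span>.
def toy_html(tokens)
  tokens.map { |text, kind|
    escaped = text.gsub(/[&<>]/, HTML_ESCAPES)
    css = CSS_CLASS[kind]
    css ? %(<span class="#{css}">#{escaped}</span>) : escaped
  }.join
end

# Toy "count" encoder: the number of non-whitespace tokens.
def toy_count(tokens)
  tokens.count { |_text, kind| kind != :space }
end

tokens = [['puts', :ident], [' ', :space], ['"Hello, world!"', :string]]
puts toy_html(tokens)   # prints: puts <span class="s">"Hello, world!"</span>
puts toy_count(tokens)  # prints: 2
```

Both helpers consume the same token list, which is the point of separating scanning from encoding: one scan can feed any number of output formats.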
Streaming saves RAM by running Scanner and Encoder in some sort of pipe mode; see TokenStream.
CodeRay.scan_stream | Scan in stream mode. |
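As a rough illustration of why a pipe mode saves memory, the sketch below hands each token to the encoder as soon as it is recognised, so no token list is ever built. It is plain Ruby with hypothetical names (toy_scan_stream, toy_encode_stream) and a made-up bracket "format"; it does not show how TokenStream itself works.

```ruby
require 'strscan'
require 'stringio'

# Toy streaming scanner: yields each token instead of collecting them.
def toy_scan_stream(code)
  scanner = StringScanner.new(code)
  until scanner.eos?
    if (text = scanner.scan(/\s+/))
      yield text, :space
    elsif (text = scanner.scan(/\d+/))
      yield text, :integer
    else
      yield scanner.getch, :other
    end
  end
end

# Toy streaming encoder: writes every token to out right away.
# Integers are marked with brackets, everything else is copied through.
def toy_encode_stream(code, out)
  toy_scan_stream(code) do |text, kind|
    out << (kind == :integer ? "[#{text}]" : text)
  end
  out
end

out = toy_encode_stream('1 + 22', StringIO.new)
puts out.string  # prints: [1] + [22]
```

Because out only needs to respond to <<, the same code can write to a file or a socket instead of a StringIO.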
CodeRay.encode | Highlight a string with a given input and output format. |
You can use an Encoder instance to highlight multiple inputs. This way, the setup for this Encoder must only be done once.
CodeRay.encoder | Create an Encoder instance with format and options. |
CodeRay.scanner | Create a Scanner instance for lang, with '' (the empty string) as default code. |
To make use of CodeRay.scanner, use CodeRay::Scanners::Scanner#code=.
The scanning methods provide more flexibility; we recommend using these.
If you want to re-use scanners and encoders (because that is faster), see CodeRay::Duo for the most convenient (and recommended) interface.