Commit 60618b0d authored by Guillaume Lazzara

Add standard support for OCR output in PAGE format.

	* scribo/io/xml/internal/page_xml_visitor.hh: Here.
parent 609f8023
2013-03-07 Guillaume Lazzara <z@lrde.epita.fr>
Add standard support for OCR output in PAGE format.
* scribo/io/xml/internal/page_xml_visitor.hh: Here.
2013-03-07 Guillaume Lazzara <z@lrde.epita.fr>
Fix sauvola_ms test.
......
// Copyright (C) 2011 EPITA Research and Development Laboratory (LRDE)
// Copyright (C) 2011, 2013 EPITA Research and Development Laboratory
// (LRDE)
//
// This file is part of Olena.
//
......@@ -268,14 +269,26 @@ namespace scribo
<< "\">"
<< std::endl;
// Add support for text recognition
// <TextEquiv>
// <PlainText></PlainText>
// <Unicode></Unicode>
// </TextEquiv>
// Save coordinates.
internal::print_image_coords(output, par, " ");
// Save text recognition results.
output << "<TextEquiv>" << std::endl
<< "<PlainText></PlainText>" << std::endl;
output << "<Unicode>";
// Retrieve and merge text from paragraph lines.
for_all_paragraph_lines(lid, line_ids)
{
line_id_t l = line_ids(lid);
if (lines(l).has_text())
output << lines(l).html_text() << std::endl;
}
output << "</Unicode>" << std::endl
<< "</TextEquiv>" << std::endl;
output << " </TextRegion>" << std::endl;
}
}
......
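
The hunk above writes the text of each recognized paragraph line into a single <Unicode> element inside <TextEquiv>. The following is a minimal, self-contained sketch of that pattern, not the Olena implementation: `Line`, `xml_escape` and `print_text_equiv` are hypothetical names introduced only for illustration, and the escaping step is an assumption standing in for whatever `html_text()` does in scribo.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a recognized text line (not the scribo type).
struct Line
{
  bool has_text;
  std::string text;
};

// Escape the characters that are special in XML character data.
// Assumption: this approximates the escaping a real serializer needs.
std::string xml_escape(const std::string& in)
{
  std::string out;
  for (char c : in)
    switch (c)
    {
      case '&': out += "&amp;"; break;
      case '<': out += "&lt;"; break;
      case '>': out += "&gt;"; break;
      default:  out += c;
    }
  return out;
}

// Write a <TextEquiv> element whose <Unicode> child holds the text of
// every line that has recognized text, one line of text per row.
void print_text_equiv(std::ostream& output, const std::vector<Line>& lines)
{
  output << "<TextEquiv>" << std::endl
         << "<PlainText></PlainText>" << std::endl;

  output << "<Unicode>";
  for (const Line& l : lines)
    if (l.has_text)
      output << xml_escape(l.text) << std::endl;
  output << "</Unicode>" << std::endl
         << "</TextEquiv>" << std::endl;
}
```

As in the hunk, lines without recognized text are skipped and <PlainText> is left empty; passing `std::cout` as `output` prints the element directly.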