NetSurf
Functions
save_text.c File Reference

Text export of HTML (implementation). More...

#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <dom/dom.h>
#include "utils/config.h"
#include "utils/log.h"
#include "utils/utf8.h"
#include "utils/utils.h"
#include "netsurf/content.h"
#include "html/box.h"
#include "html/html_save.h"
#include "netsurf/utf8.h"
#include "desktop/gui_internal.h"
#include "desktop/save_text.h"
Include dependency graph for save_text.c:

Go to the source code of this file.

Functions

static void extract_text (struct box *box, bool *first, save_text_whitespace *before, struct save_text_state *save)
 Traverse though the box tree and add all text to a save buffer. More...
 
static bool save_text_add_to_buffer (const char *text, size_t length, struct box *box, const char *whitespace_text, size_t whitespace_length, struct save_text_state *save)
 Add text to save text buffer. More...
 
void save_as_text (struct hlcache_handle *c, char *path)
 Extract the text from an HTML content and save it as a text file. More...
 
void save_text_solve_whitespace (struct box *box, bool *first, save_text_whitespace *before, const char **whitespace_text, size_t *whitespace_length)
 Decide what whitespace to place before the next bit of content-related text that is saved. More...
 

Detailed Description

Text export of HTML (implementation).

Definition in file save_text.c.

Function Documentation

◆ extract_text()

void extract_text ( struct box box,
bool *  first,
save_text_whitespace before,
struct save_text_state save 
)
static

Traverse though the box tree and add all text to a save buffer.

Parameters
boxPointer to box.
firstWhether this is before the first bit of content-related text to be saved.
beforeType of whitespace currently intended to be placed before the next bit of content-related text to be saved. Updated if this box is worthy of more significant whitespace.
saveour save_text_state workspace pointer
Returns
true iff the file writing succeeded and traversal should continue.

Definition at line 214 of file save_text.c.

References BOX_BR, BOX_FLOAT_LEFT, BOX_FLOAT_RIGHT, box::children, extract_text(), box::length, box::list_marker, box::next, save_text_add_to_buffer(), save_text_solve_whitespace(), box::text, box::type, and WHITESPACE_NONE.

Referenced by extract_text(), and save_as_text().

Here is the call graph for this function:
Here is the caller graph for this function:

◆ save_as_text()

void save_as_text ( struct hlcache_handle c,
char *  path 
)

Extract the text from an HTML content and save it as a text file.

Text is converted to the local encoding.

Parameters
cAn HTML content.
pathPath to save text file too.

Definition at line 57 of file save_text.c.

References save_text_state::block, content_get_type(), CONTENT_HTML, extract_text(), guit, html_get_box_tree(), save_text_state::length, NSERROR_OK, NSLOG, path(), result, netsurf_table::utf8, gui_utf8_table::utf8_to_local, and WHITESPACE_NONE.

Referenced by ami_file_save(), plaintext_button_clicked_cb(), and ro_gui_save_content().

Here is the call graph for this function:
Here is the caller graph for this function:

◆ save_text_add_to_buffer()

bool save_text_add_to_buffer ( const char *  text,
size_t  length,
struct box box,
const char *  whitespace_text,
size_t  whitespace_length,
struct save_text_state save 
)
static

Add text to save text buffer.

Any preceding whitespace or following space is also added to the buffer.

Parameters
textPointer to text being added.
lengthLength of text to be appended (bytes).
boxPointer to text box.
whitespace_textWhitespace to place before text for formatting may be NULL.
whitespace_lengthLength of whitespace_text.
saveOur save_text_state workspace pointer.
Returns
true iff the file writing succeeded and traversal should continue.

Definition at line 270 of file save_text.c.

References save_text_state::alloc, save_text_state::block, box::length, save_text_state::length, box::space, and text().

Referenced by extract_text().

Here is the call graph for this function:
Here is the caller graph for this function:

◆ save_text_solve_whitespace()

void save_text_solve_whitespace ( struct box box,
bool *  first,
save_text_whitespace before,
const char **  whitespace_text,
size_t *  whitespace_length 
)

Decide what whitespace to place before the next bit of content-related text that is saved.

Any existing whitespace is overridden if the whitespace for this box is more "significant".

Parameters
boxPointer to box.
firstWhether this is before the first bit of content-related text to be saved.
beforeType of whitespace currently intended to be placed before the next bit of content-related text to be saved. Updated if this box is worthy of more significant whitespace.
whitespace_textWhitespace to place before next bit of content-related text to be saved. Updated if this box is worthy of more significant whitespace.
whitespace_lengthLength of whitespace_text. Updated if this box is worthy of more significant whitespace.

Definition at line 125 of file save_text.c.

References BOX_BLOCK, BOX_BR, BOX_FLOAT_LEFT, BOX_FLOAT_RIGHT, BOX_INLINE, BOX_INLINE_CONTAINER, BOX_TABLE, BOX_TABLE_CELL, BOX_TABLE_ROW, box::list_marker, box::parent, box::style, box::type, WHITESPACE_NONE, WHITESPACE_ONE_NEW_LINE, WHITESPACE_TAB, and WHITESPACE_TWO_NEW_LINES.

Referenced by extract_text(), and selection_copy().

Here is the caller graph for this function: