courts.tjto.client.TJTOScraper

courts.tjto.client.TJTOScraper()

Scraper for the Tribunal de Justiça do Tocantins (TJTO).

Methods

Name Description
cjpg Fetch first-instance jurisprudence from TJTO (download + parse).
cjpg_download Download raw HTML pages from the TJTO first-instance jurisprudence search.
cjpg_parse Parse raw HTML pages downloaded by cjpg_download.
cjsg Fetch second-instance jurisprudence from TJTO (download + parse).
cjsg_download Download raw HTML pages from the TJTO second-instance jurisprudence search.
cjsg_ementa Fetch the ementa for a specific document by UUID.
cjsg_parse Parse raw HTML pages downloaded by cjsg_download.
cpopg Stub: first-instance case consultation is not implemented for TJTO.
cposg Stub: second-instance case consultation is not implemented for TJTO.

cjpg

courts.tjto.client.TJTOScraper.cjpg(
    pesquisa=None,
    paginas=None,
    tipo_documento='acordaos',
    ordenacao='DESC',
    numero_processo='',
    data_julgamento_inicio=None,
    data_julgamento_fim=None,
    soementa=False,
    session=None,
    **kwargs,
)

Fetch first-instance jurisprudence from TJTO (download + parse).

Shortcut for cjpg_download followed by cjpg_parse. Queries only first-instance results (instancia='1'). Accepts the same parameters as cjsg.

Returns

Name Type Description
pd.DataFrame DataFrame with jurisprudence results.
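
The download-then-parse relationship described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `fetch_first_instance` is a hypothetical helper, and it assumes only that the scraper object exposes cjpg_download and cjpg_parse as documented on this page.

```python
def fetch_first_instance(scraper, **params):
    """Download first-instance pages, then parse them, as two explicit steps.

    `scraper` is assumed to be a TJTOScraper-like object (hypothetical usage).
    """
    # Step 1: raw HTML pages (a list of strings, per the cjpg_download docs).
    raw_pages = scraper.cjpg_download(**params)
    # Step 2: parsed results (a DataFrame, per the cjpg_parse docs).
    return scraper.cjpg_parse(raw_pages)
```

Keeping the two steps separate is useful when the raw pages should be stored or inspected before parsing; when that is not needed, calling cjpg directly is shorter.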

cjpg_download

courts.tjto.client.TJTOScraper.cjpg_download(
    pesquisa=None,
    paginas=None,
    tipo_documento='acordaos',
    ordenacao='DESC',
    numero_processo='',
    data_julgamento_inicio=None,
    data_julgamento_fim=None,
    soementa=False,
    session=None,
    **kwargs,
)

Download raw HTML pages from the TJTO first-instance jurisprudence search.

Runs the jurisprudence download with instancia='1'. Accepts the same parameters as cjsg_download.

Returns

Name Type Description
list List of raw HTML strings.

cjpg_parse

courts.tjto.client.TJTOScraper.cjpg_parse(resultados_brutos)

Parse raw HTML pages downloaded by cjpg_download.

Parameters

Name Type Description Default
resultados_brutos list List of raw HTML strings. required

Returns

Name Type Description
pd.DataFrame DataFrame with parsed results.

cjsg

courts.tjto.client.TJTOScraper.cjsg(
    pesquisa=None,
    paginas=None,
    tipo_documento='acordaos',
    ordenacao='DESC',
    numero_processo='',
    data_julgamento_inicio=None,
    data_julgamento_fim=None,
    soementa=False,
    session=None,
    **kwargs,
)

Fetch second-instance jurisprudence from TJTO (download + parse).

Parameters

Name Type Description Default
pesquisa Optional[str] Search term. None
paginas Union[int, list, range, None] Pages to download (1-based). int, list, range, or None (all). None
tipo_documento str 'acordaos', 'decisoes', or 'sentencas'. 'acordaos'
ordenacao str 'DESC' (most recent), 'ASC' (oldest), 'RELEV' (most relevant). 'DESC'
numero_processo str Filter by process number. ''
data_julgamento_inicio Optional[str] Start date (DD/MM/YYYY). None
data_julgamento_fim Optional[str] End date (DD/MM/YYYY). None
soementa bool If True, restrict search to ementa text only. False

Returns

Name Type Description
pd.DataFrame DataFrame with jurisprudence results.
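
The parameter table above expects dates as DD/MM/YYYY strings and a fixed set of values for tipo_documento and ordenacao. The sketch below shows one way to assemble such arguments from Python `date` objects. `build_search_params` and the two constant sets are hypothetical names introduced for illustration; they are not part of the library, and the validation reflects only the values listed in the table.

```python
from datetime import date

# Allowed values as listed in the parameter table (assumed to be exhaustive).
TIPOS_DOCUMENTO = {"acordaos", "decisoes", "sentencas"}
ORDENACOES = {"DESC", "ASC", "RELEV"}


def build_search_params(pesquisa, inicio=None, fim=None,
                        tipo_documento="acordaos", ordenacao="DESC",
                        soementa=False):
    """Return a dict of keyword arguments suitable for cjsg or cjsg_download."""
    if tipo_documento not in TIPOS_DOCUMENTO:
        raise ValueError(f"unexpected tipo_documento: {tipo_documento!r}")
    if ordenacao not in ORDENACOES:
        raise ValueError(f"unexpected ordenacao: {ordenacao!r}")
    params = {
        "pesquisa": pesquisa,
        "tipo_documento": tipo_documento,
        "ordenacao": ordenacao,
        "soementa": soementa,
    }
    # Dates are formatted as DD/MM/YYYY, the format given in the table.
    if inicio is not None:
        params["data_julgamento_inicio"] = inicio.strftime("%d/%m/%Y")
    if fim is not None:
        params["data_julgamento_fim"] = fim.strftime("%d/%m/%Y")
    return params


params = build_search_params(
    "dano moral", inicio=date(2024, 1, 1), fim=date(2024, 3, 31)
)
```

Assuming a TJTOScraper instance named `scraper`, the resulting dict could then be passed as `scraper.cjsg(**params)`.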

cjsg_download

courts.tjto.client.TJTOScraper.cjsg_download(
    pesquisa=None,
    paginas=None,
    tipo_documento='acordaos',
    ordenacao='DESC',
    numero_processo='',
    data_julgamento_inicio=None,
    data_julgamento_fim=None,
    soementa=False,
    session=None,
    **kwargs,
)

Download raw HTML pages from the TJTO second-instance jurisprudence search.

Parameters

Name Type Description Default
pesquisa Optional[str] Search term. None
paginas Union[int, list, range, None] Pages to download (1-based). int, list, range, or None (all). None
tipo_documento str 'acordaos', 'decisoes', or 'sentencas'. 'acordaos'
ordenacao str 'DESC' (most recent), 'ASC' (oldest), 'RELEV' (most relevant). 'DESC'
numero_processo str Filter by process number. ''
data_julgamento_inicio Optional[str] Start date for judgment filter (DD/MM/YYYY). None
data_julgamento_fim Optional[str] End date for judgment filter (DD/MM/YYYY). None
soementa bool If True, restrict search to ementa text only. False

Returns

Name Type Description
list List of raw HTML strings.
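
Because cjsg_download returns plain HTML strings, the pages can be written to disk and parsed later with cjsg_parse without repeating the requests. The helpers below are a hedged sketch of that workflow; `save_pages`, `load_pages`, and the `page_NNNN.html` naming scheme are assumptions made for illustration, not library features.

```python
from pathlib import Path


def save_pages(pages, directory):
    """Write each raw HTML string to its own numbered file; return the paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, html in enumerate(pages, start=1):
        # Zero-padded names keep lexical order equal to download order.
        path = directory / f"page_{i:04d}.html"
        path.write_text(html, encoding="utf-8")
        paths.append(path)
    return paths


def load_pages(directory):
    """Read the saved pages back as a list of strings, in download order."""
    files = sorted(Path(directory).glob("page_*.html"))
    return [f.read_text(encoding="utf-8") for f in files]
```

The list returned by `load_pages` has the same shape as the output of cjsg_download, so it can be handed to cjsg_parse as resultados_brutos.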

cjsg_ementa

courts.tjto.client.TJTOScraper.cjsg_ementa(uuid)

Fetch the ementa for a specific document by UUID.

Parameters

Name Type Description Default
uuid str The document UUID (from the 'uuid' column in cjsg/cjpg results). required

Returns

Name Type Description
dict Dict with ementa text and process number.
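
Since cjsg_ementa takes one UUID at a time, collecting ementas for a whole result set means looping over the 'uuid' column. The sketch below shows one way to do that while skipping repeated UUIDs. `fetch_ementas` is a hypothetical helper, and it makes no assumption about the keys of the returned dicts, which this page does not specify.

```python
def fetch_ementas(scraper, uuids):
    """Map each distinct UUID to the dict returned by scraper.cjsg_ementa.

    `scraper` is assumed to be a TJTOScraper-like object; `uuids` is any
    iterable of UUID strings, such as the 'uuid' column of a results frame.
    """
    ementas = {}
    for uuid in uuids:
        if uuid in ementas:
            continue  # already fetched; avoid a duplicate request
        ementas[uuid] = scraper.cjsg_ementa(uuid)
    return ementas
```

With a DataFrame `df` returned by cjsg or cjpg, this could be called as `fetch_ementas(scraper, df["uuid"])`.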

cjsg_parse

courts.tjto.client.TJTOScraper.cjsg_parse(resultados_brutos)

Parse raw HTML pages downloaded by cjsg_download.

Parameters

Name Type Description Default
resultados_brutos list List of raw HTML strings. required

Returns

Name Type Description
pd.DataFrame DataFrame with parsed results.

cpopg

courts.tjto.client.TJTOScraper.cpopg(id_cnj)

Stub: first-instance case consultation is not implemented for TJTO.

cposg

courts.tjto.client.TJTOScraper.cposg(id_cnj)

Stub: second-instance case consultation is not implemented for TJTO.