mirror of
https://github.com/catdevnull/preciazo.git
synced 2025-02-17 00:24:05 +00:00
*** they're WARCs, not tars
parent c217161d21
commit 254a068091
1 changed file with 2 additions and 2 deletions
@@ -22,7 +22,7 @@ start by downloading a WARC with 50 sample pages, and recompress it with zstd:

 ```
 wget --no-verbose --tries=3 --delete-after --input-file ./data/samples/Dia.txt --warc-file=dia-sample
-gzip -dc dia-sample.warc.gz | zstd --long -15 --no-sparse -o dia-sample.tar.zst
+gzip -dc dia-sample.warc.gz | zstd --long -15 --no-sparse -o dia-sample.warc.zst
 ```

 then scrape it into a DB:
@@ -30,7 +30,7 @@ then scrape it into a DB:
 ```
 cd scraper/
 bun install
-bun cli.ts scrap ../dia-sample.tar.zst
+bun cli.ts scrap ../dia-sample.warc.zst
 ```

 now view it on the site: