awk => Row manipulation

Extracting specific lines from a text file

Suppose we have the following file:

cat -n lorem_ipsum.txt
 1    Lorem Ipsum is simply dummy text of the printing and typesetting industry.
 2    Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
 3    It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
 4    It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum

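We want to extract lines 2 and 3 of this file. A range pattern of the form start,end matches every record from the one where start is true through the one where end is true: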

awk 'NR==2,NR==3' lorem_ipsum.txt

This prints lines 2 and 3 (without the line numbers, which were added only by cat -n):

Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.
It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
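
The range pattern keeps reading the file to the end even after the range has closed. For large files, a variant that stops after the last wanted line may be preferable. The following is a minimal sketch, not part of the original example: the variable names first and last are hypothetical (passed with -v), and the sample file is created only for the demonstration.

```shell
# Create a small sample file (hypothetical content, for demonstration only).
sample=$(mktemp)
printf 'line one\nline two\nline three\nline four\n' > "$sample"

# Print every line from "first" onward, and quit as soon as "last" has been
# printed, so the rest of the file is never read.
# "first" and "last" are illustrative variable names, not awk built-ins.
extracted=$(awk -v first=2 -v last=3 '
    NR >= first { print }
    NR == last  { exit }
' "$sample")

printf '%s\n' "$extracted"
rm -f "$sample"
```

The two rules run in order for each record, so the last wanted line is printed before exit takes effect.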

Extracting a specific column/field from a specific line

Suppose you have the following data file:

cat data.csv
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50

You might need to read the fourth column of the third line, which here is "24". The pattern NR==3 selects the line, and $4 selects the field:

awk 'NR==3 { print $4 }' data.csv

24
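
The row and column can also be passed in from the shell instead of being hard-coded. This is a minimal sketch under stated assumptions: row and col are hypothetical variable names supplied with -v, and the data above is recreated in a temporary file so the snippet runs on its own.

```shell
# Recreate the data file shown above in a temporary file.
data=$(mktemp)
printf '%s\n' \
    '1 2 3 4 5 6 7 8 9 10' \
    '11 12 13 14 15 16 17 18 19 20' \
    '21 22 23 24 25 26 27 28 29 30' \
    '31 32 33 34 35 36 37 38 39 40' \
    '41 42 43 44 45 46 47 48 49 50' > "$data"

# "row" and "col" are illustrative names, not awk built-ins.
# $col means "the field whose number is stored in col".
# exit stops awk once the wanted line has been handled.
cell=$(awk -v row=3 -v col=4 'NR == row { print $col; exit }' "$data")

printf '%s\n' "$cell"
rm -f "$data"
```

For a file that is actually comma-separated, the same command would additionally need -F',' to set the field separator.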

Modifying rows on the fly (e.g. to fix Windows line endings)

If a file may contain Windows-style or Unix-style line endings (or even a mixture of both), the intended text processing may not work as expected: awk splits records on \n only, so the \r of a Windows line ending stays attached to the last field.

Sample:

$ echo -e 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n' \
> | awk -F'\t' '$1 != "" { print $1 }' \
> | hexdump -c
0000000   E   n   t   r   y       1  \n   E   n   t   r   y       2   .
0000010   1  \n   E   n   t   r   y       3  \r  \n  \r  \n            
000001d

This can easily be fixed with an additional rule inserted at the beginning of the awk script:

/\r$/ { $0 = substr($0, 1, length($0) - 1) }

Because the action does not end with next, the rules that follow are still applied as before. Assigning to $0 also makes awk re-split the record into fields, so $1 reflects the trimmed line.

Sample (with the line-ending fix):

$ echo -e 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n' \
> | awk -F'\t' '/\r$/ { $0 = substr($0, 1, length($0) - 1) } $1 != "" { print $1 }' \
> | hexdump -c
0000000   E   n   t   r   y       1  \n   E   n   t   r   y       2   .
0000010   1  \n   E   n   t   r   y       3  \n                        
000001a
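
An alternative way to write the same rule is with sub(), which removes a trailing carriage return from $0 in place and does nothing on lines that lack one. The following is a sketch on the same input, not the original author's method. It uses printf instead of echo -e (an assumption made here for portability across shells), with one extra \n to stand in for the newline that echo appends.

```shell
# Same input as in the sample above.
# sub(/\r$/, "") edits $0, which also re-splits the fields,
# so the later rule sees $1 without the carriage return.
fixed=$(printf 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n\n' \
    | awk -F'\t' '{ sub(/\r$/, "") } $1 != "" { print $1 }')

# Dump the result byte by byte to confirm that no \r remains.
printf '%s\n' "$fixed" | od -c
```

Since sub() only modifies records that match, no /\r$/ pattern is needed in front of the action.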

Modified text is an extract of the original Stack Overflow Documentation

Licensed under CC BY-SA 3.0

Not affiliated with Stack Overflow
