May 23, 2021 Shell - An example of programming
This week we explore file operations.
In daily study and work we constantly deal with many kinds of files: ordinary text files, executable programs, documents containing control characters, directories that hold other files, network socket files, and device files. These files also have properties such as owner, size, and creation and modification dates. A file corresponds to some data blocks in the file system, to a contiguous region of storage on devices such as disks, and to sets of glyphs of different shapes on display devices.
To keep the focus on the file itself, this section does not delve into the file system or how storage devices organize files (both are discussed in more detail later). Instead, it looks at the most familiar side of a file: a sequence of characters (bytes). The string operations of the Shell programming paradigm described earlier will therefore be widely used here. The reading and writing of ordinary files, namely "redirection", should already be familiar, and it is introduced here on its own.
The philosophy of "everything is a file" is also deeply reflected in Shell programming: under Linux, a file is abstracted to a high degree into a number, the file descriptor.
Let's start with the various properties of a file, and then describe the general operation of a normal file.
First look at the properties of the file through its structure:
struct stat {
    dev_t st_dev; /* device */
    ino_t st_ino; /* inode number */
    mode_t st_mode; /* mode (file type and permissions) */
    nlink_t st_nlink; /* number of hard links */
    uid_t st_uid; /* user ID */
    gid_t st_gid; /* group ID */
    dev_t st_rdev; /* device type (for device files) */
    off_t st_size; /* file size in bytes */
    unsigned long st_blksize; /* block size */
    unsigned long st_blocks; /* number of blocks */
    time_t st_atime; /* time of last access */
    time_t st_mtime; /* time of last modification */
    time_t st_ctime; /* time of last status change (attributes) */
};
Let's look at these properties one by one. To view the properties of a file, use the `stat` command, which lists the information in the structure above. The `ls` command can also display them when given options such as `-l`.
File types correspond to `st_mode` above. There are many file types: regular files, links (hard links and symbolic links), named pipes, device files (character devices and block devices), socket files, and so on. Different file types serve different purposes.
$ ls -l
total 12
drwxr-xr-x 2 root root 4096 2007-12-07 20:08 directory_file
prw-r--r-- 1 root root 0 2007-12-07 20:18 fifo_pipe
brw-r--r-- 1 root root 3, 1 2007-12-07 21:44 hda1_block_dev_file
crw-r--r-- 1 root root 1, 3 2007-12-07 21:43 null_char_dev_file
-rw-r--r-- 2 root root 506 2007-12-07 21:55 regular_file
-rw-r--r-- 2 root root 506 2007-12-07 21:55 regular_file_hard_link
lrwxrwxrwx 1 root root 12 2007-12-07 20:15 regular_file_soft_link -> regular_file
$ stat directory_file/
File: `directory_file/'
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: 301h/769d Inode: 521521 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2007-12-07 20:08:18.000000000 +0800
Modify: 2007-12-07 20:08:18.000000000 +0800
Change: 2007-12-07 20:08:18.000000000 +0800
$ stat null_char_dev_file
File: `null_char_dev_file'
Size: 0 Blocks: 0 IO Block: 4096 character special file
Device: 301h/769d Inode: 521240 Links: 1 Device type: 1,3
Access: (0644/crw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2007-12-07 21:43:38.000000000 +0800
Modify: 2007-12-07 21:43:38.000000000 +0800
Change: 2007-12-07 21:43:38.000000000 +0800
Description: The first character of each line of the `ls -l` output differs, reflecting the type of each file: `d` for a directory, `-` for a regular file (or hard link), `l` for a symbolic link, `p` for a named pipe, `b` and `c` for block and character devices respectively, and `s` for a socket.
In the output of the `stat` command, the file type appears at the end of the second line. As the operations above show, `directory_file` is reported as `directory` and `null_char_dev_file` as `character special file`.
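Individual fields of the structure can also be queried on their own. The following is a minimal sketch that assumes the GNU coreutils `stat` (its `-c` format option); the helper name `file_kind` and the scratch directory are made up for illustration.

```shell
# file_kind PATH: print the file type as reported by stat (GNU coreutils assumed).
file_kind() {
    stat -c '%F' "$1"
}

demo_dir=$(mktemp -d)                 # scratch directory for the demo
touch "$demo_dir/regular_file"
ln -s regular_file "$demo_dir/soft_link"

file_kind "$demo_dir"                 # directory
file_kind "$demo_dir/regular_file"    # regular empty file
file_kind "$demo_dir/soft_link"       # symbolic link

# Other format sequences: %s size, %h hard-link count, %i inode, %u/%g numeric owner and group
stat -c 'size=%s links=%h inode=%i uid=%u gid=%g' "$demo_dir/regular_file"
rm -r "$demo_dir"
```

Because `stat` does not follow symbolic links unless `-L` is given, the link is reported as a link rather than as its target.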
Typically, only directories, regular files, and symbolic links are used, and other file types are rarely encountered. They are still useful: embedded development or inter-process communication may involve device files and named pipes (FIFOs). Here is a simple way to see the differences between them. The principles are described in the next section, File Systems for Shell Programming Paradigms; if you are interested, you can look up in advance the role of device files, the differences between block devices and character devices, and how to write the related device drivers.
A regular file is a collection of characters, so it can be read, written, and so on:
$ echo "hello, world" > regular_file
$ cat regular_file
hello, world
New files can be created inside a directory, which is why a directory is also called a folder. The structure of a directory file is analyzed later; it actually holds the names of the files beneath it.
$ cd directory_file
$ touch file1 file2 file3
Named pipes are more interesting: a read blocks until there is content, and a write blocks until someone reads. They are often used for inter-process communication.
You can open two terminals, terminal1 and terminal2, and try:
terminal1$ cat fifo_pipe # blocks here at first; test is printed only after the write below happens
terminal2$ echo "test" > fifo_pipe
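The two-terminal experiment can also be reproduced in a single script by putting the writer in the background. This is only a sketch; the helper name `fifo_roundtrip` and the temporary path are illustrative.

```shell
# fifo_roundtrip MESSAGE: send MESSAGE through a named pipe and print what the reader gets.
fifo_roundtrip() {
    dir=$(mktemp -d)
    mkfifo "$dir/fifo_pipe"
    # The writer runs in the background and blocks until a reader opens the pipe.
    echo "$1" > "$dir/fifo_pipe" &
    # The reader blocks until data arrives, then prints it.
    cat "$dir/fifo_pipe"
    wait                      # reap the background writer
    rm -r "$dir"
}

fifo_roundtrip "test"         # prints: test
```

Without the `&`, the script would hang on the `echo`, because nothing would ever open the pipe for reading.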
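For block devices and character devices, the device files here correspond to `/dev/hda1` and `/dev/null` respectively. You will have used such files if you have ever mounted a USB stick or written a simple script: :-)
$ mount hda1_block_dev_file /mnt # mount the first partition of the hard disk on /mnt (how mounting works is discussed in the next section)
$ echo "fewfewfef" > /dev/null # /dev/null is like a black hole: whatever is thrown in disappears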
The last two files are a hard link and a soft link to `regular_file`. Reading or writing through either of them gives the same content, but deleting one does not remove the others. So what is the difference between a hard link and a soft link?
The former can be said to be the original file itself, while the latter is only an `inode` with no actual storage space of its own. Use `stat` to see the differences between them, including values such as `Blocks` and `inode`; you can also consider comparing them with `diff`.
$ ls -l regular_file*
-rw-r--r-- 2 root root 204800 2007-12-07 22:30 regular_file
-rw-r--r-- 2 root root 204800 2007-12-07 22:30 regular_file_hard_link
lrwxrwxrwx 1 root root 12 2007-12-07 20:15 regular_file_soft_link -> regular_file
$ rm regular_file # delete the original file
$ cat regular_file_hard_link # the hard link is still there, and so is its content
fefe
$ cat regular_file_soft_link
cat: regular_file_soft_link: No such file or directory
Although the soft link file itself is still there, it can no longer be read because it does not store the content itself. This is the difference between a soft link and a hard link.
Note that hard links cannot cross file systems, while soft links can. In addition, hard links to directories are not allowed.
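The difference can be checked directly by comparing inode numbers. The sketch below assumes GNU `stat`; the helper name `same_inode` and the file names in the scratch directory are illustrative.

```shell
# same_inode A B: succeed if A and B are names for the same inode on the same device.
same_inode() {
    [ "$(stat -c '%d:%i' "$1")" = "$(stat -c '%d:%i' "$2")" ]
}

work=$(mktemp -d)
echo "fefe" > "$work/regular_file"
ln    "$work/regular_file" "$work/hard_link"
ln -s regular_file         "$work/soft_link"

same_inode "$work/regular_file" "$work/hard_link" && echo "hard link: same inode"
same_inode "$work/regular_file" "$work/soft_link" || echo "soft link: its own inode"

rm "$work/regular_file"
cat "$work/hard_link"                                   # still prints fefe
[ -e "$work/soft_link" ] || echo "soft link is now dangling"
rm -r "$work"
```

The data survives the `rm` because the inode still has one remaining name; the soft link only stored the name `regular_file`, which no longer exists.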
The Linux file system distinguishes the file types above, but regular files can be divided further according to the "data structure" of their content: common text files, executable `ELF` files, `odt` documents, `jpg` images, `swap` partition files, `pdf` files, and so on. Apart from text files, most of them are binary files with a specific structure, so special tools are needed to create and edit them. For the formats of the various file types, refer to the relevant documentation and standards.
It is worth learning more about how `ELF` files work under Linux; if you are interested, studying `ELF` files is recommended, as they are critical for embedded Linux engineers.
Although each kind of regular file has its own dedicated tools, the files can still be read and written directly. A few tools are mentioned here first; the details are discussed later.
`od`: dumps the contents of a file in octal or other formats.
`strings`: reads out the printable characters in a file.
`gcc`, `gdb`, `readelf`, `objdump`, etc.: ELF file analysis and processing tools (the `gcc` compiler, the `gdb` debugger, `readelf` for analyzing ELF files, and the `objdump` disassembly tool).
One more very important command is `file`, which is used to view the properties of various types of files. Compared with the `stat` command, it can further identify regular files, that is, the files that `stat` reports simply as `regular file`. Because a `regular file` can have many different structures, it is interpreted differently and performs different actions with the support of the operating system. Although files under Linux also carry suffixes so that users can easily tell their type, the Linux operating system recognizes file types from the file header rather than the suffix, which makes it less error-prone when interpreting files.
Here is a brief demonstration of the `file` command.
$ file ./
./: directory
$ file /etc/profile
/etc/profile: ASCII English text
$ file /lib/libc-2.5.so
/lib/libc-2.5.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), not stripped
$ file /bin/test
/bin/test: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), stripped
$ file /dev/hda
/dev/hda: block special (3/0)
$ file /dev/console
/dev/console: character special (5/1)
$ cp /etc/profile .
$ tar zcf profile.tar.gz profile
$ file profile.tar.gz
profile.tar.gz: gzip compressed data, from Unix, last modified: Tue Jan 4 18:53:53 2000
$ mkfifo fifo_test
$ file fifo_test
fifo_test: fifo (named pipe)
For more usage, see the manual of the `file` command. For how the `file` command works, refer to the `magic` manual (and look at `/etc/file/magic` to learn what a file's `magic number` is).
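To get a feel for magic numbers, the first bytes of a file can be dumped with `od`. This is a minimal sketch; the helper name `magic_bytes` is hypothetical, and gzip is used only because its two-byte signature is easy to produce on the spot.

```shell
# magic_bytes FILE N: print the first N bytes of FILE as space-separated hex.
magic_bytes() {
    # od -An drops the offset column, -tx1 prints one byte per hex pair;
    # xargs (which runs echo by default) collapses the padding and line breaks.
    head -c "$2" "$1" | od -An -tx1 | xargs
}

tmp=$(mktemp -d)
printf 'hello, world\n' | gzip -c > "$tmp/hello.gz"
magic_bytes "$tmp/hello.gz" 2     # 1f 8b, the gzip signature
rm -r "$tmp"
```

Run with a length of 4 on an ELF executable, the same helper should show `7f 45 4c 46`, that is, the byte 0x7f followed by the letters E, L, F.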
As a multi-user system, Linux makes it convenient for several users to share one machine. For the files on a system, it distinguishes users by file ownership so that each user can be given different permissions on different files. For easier management, the ownership of a file includes both the user and the user group that the file belongs to, since a user can belong to more than one group. Let's start with a brief introduction to managing users and groups under Linux.
Linux provides a set of commands for managing users and groups: `useradd` and `groupadd` create them, `userdel` and `groupdel` delete them, and `passwd` modifies user passwords. Linux also provides two corresponding configuration files, `/etc/passwd` and `/etc/group`, and some systems keep the passwords separately in `/etc/shadow`. For details on their use, please refer to the material that follows; here only some of the relationships between files and users are covered.
To change the owner and group of a file, use `chown`:
$ chown username:groupname filename
To change the owner of all files in a directory recursively, add the `-R` option.
The file structure listed at the beginning of this section contains only a user `ID` and a group `ID`, yet `ls -l` shows the user name and group name. How is this achieved?
First look at the output of the `-n` option:
$ ls -n regular_file
-rw-r--r-- 1 0 0 115 2007-12-07 23:45 regular_file
$ ls -l regular_file
-rw-r--r-- 1 root root 115 2007-12-07 23:45 regular_file
As you can see, `ls -n` shows the user `ID` and group `ID`, while `ls -l` shows their names. Remember the two configuration files `/etc/passwd` and `/etc/group` mentioned above? They hold the mappings between user `ID` and user name and between group `ID` and group name, so `ls -l` can look up the names that correspond to the `ID` values stored in the file structure.
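The lookup itself can be imitated with `awk`. This sketch reads only the two configuration files; the real `ls` goes through the C library's name lookup, which may consult other sources as well, so treat the helper names `uid_to_name` and `gid_to_name` as illustrations rather than as how `ls` is implemented.

```shell
# uid_to_name UID: print the user name recorded for UID in /etc/passwd.
# Fields in /etc/passwd are colon-separated: name:password:UID:GID:...
uid_to_name() {
    awk -F: -v uid="$1" '$3 == uid { print $1; exit }' /etc/passwd
}

# gid_to_name GID: the same lookup against /etc/group (name:password:GID:members).
gid_to_name() {
    awk -F: -v gid="$1" '$3 == gid { print $1; exit }' /etc/group
}

uid_to_name 0     # root on typical Linux systems
gid_to_name 0
```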
If you want to learn more about how the `ls -l` command is implemented, use `strace` to check whether it reads these two configuration files.
$ strace -f -o strace.log ls -l regular_file
$ cat strace.log | egrep "passwd|group|shadow"
2989 open("/etc/passwd", O_RDONLY) = 3
2989 open("/etc/group", O_RDONLY) = 3
Description: `strace` is used to trace system calls and signals. Like other powerful tools such as `gdb`, it is implemented on top of the `ptrace` system call.
In fact, ownership and permissions should not be discussed separately: only their combination makes a multi-user system possible, since otherwise the operations of different users on a file could not be isolated. So let's turn to file permissions.
The last 9 characters of the first column of the `ls -l` output, for example `rwxr-xr-x`, correspond to part of `st_mode` in the file structure (`st_mode` contains both file type information and file permission information). These characters fall into three groups, `rwx`, `r-x`, and `r-x`, which give the permissions of the file's owner, of the owning group, and of all other users. In each group, `r`, `w`, and `x` mean readable, writable, and executable, and `-` means the permission is absent. The same information can be written in octal: `rwxr-xr-x` is binary 111101101, which is octal 755.
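The conversion from the nine-character form to octal can be sketched as a small shell function. The name `perm_to_octal` is made up for illustration, and only the plain `r`, `w`, `x`, and `-` characters are handled.

```shell
# perm_to_octal STRING: convert a 9-character permission string to octal.
# setuid/setgid/sticky letters (s, t) are out of scope for this sketch.
perm_to_octal() {
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        triplet=$(printf '%s' "$perms" | cut -c1-3)   # next rwx group
        perms=$(printf '%s' "$perms" | cut -c4-)      # remainder of the string
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read    = 4
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write   = 2
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute = 1
        result=$result$digit
    done
    echo "$result"
}

perm_to_octal rwxr-xr-x    # 755
perm_to_octal rw-r--r--    # 644
```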
Because of this, there is more than one way to modify the permissions of a file, and all of them go through the `chmod` command.
For example, to make `regular_file` readable, writable, and executable for everyone, the target permission is `rwxrwxrwx`, which is binary 111111111, or octal 777.
The permission can therefore be set in two ways.
$ chmod a+rwx regular_file
Or
$ chmod 777 regular_file
Description: `a` refers to all users; to grant read, write, and execute permission only to the file's owner, replace `a` with `u`. `+` adds permissions; conversely, use `-` to remove a permission. `rwx` stands for readable, writable, and executable.
For more usage, see the manual of the `chmod` command.
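The effect of the symbolic forms can be observed by printing the mode in octal after each change. A small sketch, assuming GNU `stat`; the helper name `mode_of` is illustrative.

```shell
# mode_of FILE: print the permission bits of FILE in octal (GNU stat assumed).
mode_of() {
    stat -c '%a' "$1"
}

f=$(mktemp)                        # scratch file
chmod 644 "$f";   mode_of "$f"     # 644
chmod u+x "$f";   mode_of "$f"     # 744: execute added for the owner only
chmod go-r "$f";  mode_of "$f"     # 700: read removed for group and others
chmod a+rwx "$f"; mode_of "$f"     # 777
rm -f "$f"
```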
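In addition to these permissions, there are two security-related ones: `setuid/setgid` and read-only control.
If the `setuid/setgid` permission is set on a file (a program or command), a user who executes the file runs it with the identity of the file's owner (`root`, for a program owned by `root`), which can be a security risk. If the read-only attribute of a file is set, users can only read it, which gives some protection against "nasty" operations such as `rm -rf`.
By default, the system does not allow ordinary users to execute the `passwd` command; with `setuid/setgid`, ordinary users can be authorized to execute it.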
$ ls -l /usr/bin/passwd
-rwx--x--x 1 root root 36092 2007-06-19 14:59 /usr/bin/passwd
$ su # switch to the root user and add the setuid/setgid bits to the program or command
$ chmod +s /usr/bin/passwd
$ ls -l /usr/bin/passwd
-rws--s--x 1 root root 36092 2007-06-19 14:59 /usr/bin/passwd
$ exit
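$ passwd # an ordinary user runs this command to change their own password
Description: the `setuid` and `setgid` bits let ordinary users run, in the role of the `root` user, programs or commands that only the `root` account could otherwise run.
This makes administration more convenient: for example, the steps above let ordinary users change their own passwords instead of requiring `root` to do this for each user. It remains a potential security risk, however.
For more on `setuid/setgid`, please refer to the recommended material at the end.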
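Whether the bit is set can be tested from a script with the shell's `-u` test. The sketch below works on a scratch file owned by the current user, so no `root` access is needed; the helper name `has_setuid` is illustrative, and GNU `stat` is assumed for the last line.

```shell
# has_setuid FILE: succeed if the setuid bit is set on FILE.
has_setuid() {
    [ -u "$1" ]
}

f=$(mktemp)
chmod 755 "$f"
has_setuid "$f" || echo "no setuid bit yet"
chmod u+s "$f"                 # the owner may set the bit on their own file
has_setuid "$f" && echo "setuid bit set"
stat -c '%a %A' "$f"           # 4755 -rwsr-xr-x
rm -f "$f"
```

For auditing an existing system, `find` with `-perm -4000` lists files that have the bit set.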
Read-only permission example: lock important files (by setting the immutable bit) to avoid the disastrous consequences of mistakes such as `rm -rf`.
$ chattr +i regular_file
$ lsattr regular_file
----i-------- regular_file
$ rm regular_file # with the immutable bit set, no "destructive" action can be taken on the file
rm: remove write-protected regular file `regular_file'? y
rm: cannot remove `regular_file': Operation not permitted
$ chattr -i regular_file # to operate on the file normally again, remove the bit
$ rm regular_file
Description: `chattr` can be used to set special attributes on files; for more usage, refer to the `chattr` help.
For a regular file, the file size is the size of its content. A directory is a special file whose content is a directory structure holding information about the files in it, so its size is generally fixed, and the number of files it can hold therefore has an upper limit, roughly its size divided by the length of a file name entry. The "file size" shown for a device file is actually its major and minor device numbers, while a named pipe, because of its special read and write behavior, usually has size 0. A hard link (which cannot be created for a directory) is essentially another name for the original file, so its size is the size of the original file. A soft link is just an `inode` holding a pointer to the original file, so its size is only the number of bytes in the name of the file it points to.
Let's reinforce this with a demonstration.
Example: sizes of the original file and its links:
$ echo -n "abcde" > regular_file # write 5 bytes to regular_file
$ ls -l regular_file*
-rw-r--r-- 2 root root 5 2007-12-08 15:28 regular_file
-rw-r--r-- 2 root root 5 2007-12-08 15:28 regular_file_hard_file
lrwxrwxrwx 1 root root 12 2007-12-07 20:15 regular_file_soft_link -> regular_file
lrwxrwxrwx 1 root root 22 2007-12-08 15:21 regular_file_soft_link_link -> regular_file_soft_link
$ i="regular_file"
$ j="regular_file_soft_link"
$ echo ${#i} ${#j} # a soft link stores exactly the number of bytes in the name of the file it points to
12 22
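The same comparison can be made without reading the `ls -l` columns by eye. A minimal sketch, assuming GNU `stat` and `readlink`; the helper name `link_size` is hypothetical.

```shell
# link_size LINK: size in bytes of the symbolic link itself
# (stat does not follow the link unless -L is given).
link_size() {
    stat -c '%s' "$1"
}

d=$(mktemp -d)
printf 'abcde' > "$d/regular_file"
ln -s regular_file "$d/regular_file_soft_link"

target=$(readlink "$d/regular_file_soft_link")
echo "${#target}"                            # 12, the length of "regular_file"
link_size "$d/regular_file_soft_link"        # 12 as well
stat -L -c '%s' "$d/regular_file_soft_link"  # 5, the size of the target's content
rm -r "$d"
```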
The "file size" of a device file: the major and minor device numbers
$ ls -l hda1_block_dev_file
brw-r--r-- 1 root root 3, 1 2007-12-07 21:44 hda1_block_dev_file
$ ls -l null_char_dev_file
crw-r--r-- 1 root root 1, 3 2007-12-07 21:43 null_char_dev_file
Supplement: the major and minor device numbers play different roles. When a device file is opened, the kernel uses the `major number` to find the driver that has registered under that number (use `cat /proc/devices` to see which drivers are registered under which major numbers), while the `minor number` is passed by the kernel to the driver itself (see Chapter X of The Linux Primer).
So the kernel needs only the major number to find the driver for a device, and the driver uses the minor number for finer-grained access, for example to reach different parts of a device (a disk split into partitions appears as `hda1`, `hda2`, `hda3`, and so on) or to produce random numbers with different requirements (`/dev/random`, `/dev/urandom`, etc.).
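The two numbers can be read from a script as well. GNU `stat` reports them in hexadecimal through `%t` and `%T`, so this sketch converts them to decimal; the helper name `dev_numbers` is made up, and the values in the comments are the ones used on Linux.

```shell
# dev_numbers FILE: print "major,minor" of a device file in decimal.
dev_numbers() {
    stat -c '%t %T' "$1" | {
        read -r major minor
        # %t and %T are hexadecimal; the 0x prefix lets printf convert them.
        printf '%d,%d\n' "0x$major" "0x$minor"
    }
}

dev_numbers /dev/null      # 1,3 on Linux
dev_numbers /dev/zero      # 1,5 on Linux
```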
Why does a directory file have the size it does? Look at the directory size below: the blocks of a directory file hold an entry for every file name in that directory.
$ ls -ld directory_file/
drwxr-xr-x 2 root root 4096 2007-12-07 23:14 directory_file/
The structure of the directory is as follows:
struct dirent {
long d_ino;
off_t d_off;
unsigned short d_reclen;
char d_name[NAME_MAX+1]; /* file name */
};
The time properties of a file record when users operated on it, and they give administrators a reference for system management, for judging file versions, and so on.
Therefore, when you only want to read a file, it is recommended to use a reading tool such as `cat` rather than `vim`: even if you make no modifications, the file's timestamps change as soon as the save command is executed.
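The three timestamps can be watched independently. The following sketch assumes GNU `touch` and `stat`; it pins the modification time to a known instant and then changes only an attribute, which should move the change time but leave the modification time alone. The helper names are illustrative.

```shell
# mtime_of / ctime_of FILE: timestamps as seconds since the epoch (GNU stat assumed).
mtime_of() { stat -c '%Y' "$1"; }
ctime_of() { stat -c '%Z' "$1"; }

f=$(mktemp)
touch -d '2000-01-01 00:00:00 UTC' "$f"   # set atime and mtime to a known instant
chmod 600 "$f"                            # attribute change: updates ctime only
mtime_of "$f"                             # 946684800, unchanged by chmod
ctime_of "$f"                             # the current time, later than mtime
rm -f "$f"
```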
The file name is not stored in the file structure but in the structure of the directory that contains it. Therefore, file names must be unique within a directory.
Common operations on files include creating, deleting, modifying, reading, and writing. What happens behind each operation is analyzed in detail in the next chapter, File System Operations for Shell Programming Paradigms.
Sockets are a special class of files that can be created from C and are not described here (for the time being I do not know whether they can be created directly with commands). The other file types are created with commands.
$ touch regular_file # create a regular file
$ mkdir directory_file # create a directory, which can contain more files
$ ln regular_file regular_file_hard_link # hard link: another name for the original file's data
$ ln -s regular_file regular_file_soft_link # soft link: like a file pointer to the original file
$ mkfifo fifo_pipe # or "mknod fifo_pipe p"; a FIFO is first in, first out
$ mknod hda1_block_dev_file b 3 1 # block device
$ mknod null_char_dev_file c 1 3 # character device
Creating a file actually adds a node (inode) to the file system, and the node's information is saved in the file system's inode table. More vividly, a tree grows a new leaf (a file) or a new branch (a directory, on which further leaves can grow), which can be displayed visually with the `tree` command or the `ls` command. From the point of view of daily use, the file system can be seen as an upside-down tree; the two are so alike that the picture is easy to remember.
$ tree current_directory
Or
$ ls current_directory
The most immediate impression of deleting a file is that it no longer exists, which can also be seen with the `ls` or `tree` command, as if a branch had been cut off a tree or a leaf removed.
In fact, the file is not erased immediately; it is only marked as deleted. So if no later disk write has "overwritten" the corresponding disk space, the file is recoverable in principle. Such recovery is often very cumbersome, however, so think twice before deleting important data and keep good backups. The corresponding approach can be found in the follow-up material.
The command for deleting files is `rm`; to delete an empty directory, the `rmdir` command can be used.
For example:
$ rm regular_file
$ rmdir directory_file
$ rm -r directory_file_not_empty
`rm` has two very important options. `-f` is very "barbaric" and has probably caused pain to many Linux users; `-i` is very "gentle" and probably makes many users impatient. Which one to use depends on your "mood": if you keep good backups, or take other effective measures against catastrophic consequences, you can work with peace of mind.
Copying a file usually means making a "temporary" copy of its content. From the introduction at the beginning of this section, it should be clear that hard and soft links are also, in a sense, "copies" of a file: the former stays synchronized with the file content, and the latter reaches the file content only when it is read or written. For example:
Copy with the `cp` command (use the `-r` option to copy a directory)
$ cp regular_file regular_file_copy
$ cp -r directory_file directory_file_copy
Create a hard link (a `link` differs from a `copy`: the former stays synchronized with the original, while after a copy the two are no longer related)
$ ln regular_file regular_file_hard_link
Create a soft link
$ ln -s regular_file regular_file_soft_link
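The three kinds of "copy" behave differently once the original changes. A small sketch with illustrative names (`contents_match` and the scratch directory are not from the original):

```shell
# contents_match A B: succeed if A and B currently have identical contents.
contents_match() {
    cmp -s "$1" "$2"
}

d=$(mktemp -d)
echo "abcde" > "$d/regular_file"
cp    "$d/regular_file" "$d/regular_file_copy"
ln    "$d/regular_file" "$d/regular_file_hard_link"
ln -s regular_file      "$d/regular_file_soft_link"

echo "more" >> "$d/regular_file"      # modify the original afterwards

contents_match "$d/regular_file" "$d/regular_file_hard_link" && echo "hard link follows"
contents_match "$d/regular_file" "$d/regular_file_soft_link" && echo "soft link follows"
contents_match "$d/regular_file" "$d/regular_file_copy"      || echo "copy is independent"
rm -r "$d"
```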
Renaming a file actually modifies only its name identifier.
The `mv` command renames a file.
$ mv regular_file regular_file_new_name
Editing a file means operating on its content. For ordinary text files this mainly involves reading, writing, appending, and deleting content. It is usually done with specialized editors, such as `vim` and `emacs` on the command line and `gedit` and `kedit` under graphical interfaces. Specific file types have their own editing and processing tools, such as `gimp` for images and `OpenOffice` for documents.
These tools generally have dedicated tutorials.
Here's a quick introduction to these general editing operations for files under Linux through redirection.
Create a file and write `abcde` to it
$ echo "abcde" > new_regular_file
Append another line of `abcde` to the file above
$ echo "abcde" >> new_regular_file
Read a file line by line
$ while IFS= read -r LINE; do echo "$LINE"; done < test.sh
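The read loop is easy to wrap in a function. This sketch numbers the lines of a file; `IFS=` preserves leading and trailing blanks, `-r` keeps backslashes literal, and the extra test handles a final line that lacks a trailing newline. The name `number_lines` is illustrative.

```shell
# number_lines FILE: print each line of FILE prefixed with its line number.
number_lines() {
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done < "$1"
}

f=$(mktemp)
printf '  indented\nback\\slash\nlast' > "$f"   # note: no newline after "last"
number_lines "$f"
rm -f "$f"
```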
Tip: To execute a string variable that contains a redirection as a command, use the `eval` command; otherwise the redirection is not interpreted.
For example
$ redirect="echo \"abcde\" >test_redirect_file"
$ $redirect # here > is printed as a literal character rather than interpreted as a redirection
"abcde" >test_redirect_file
$ eval $redirect # only this way is > interpreted as a redirection
$ cat test_redirect_file
abcde
Compressing and decompressing files is, in a sense, meant to make file content easier to transfer, but there are also specific uses, such as kernel and file system image files (for more, please refer to the material that follows).
Here are just a few common compression and decompression methods:
tar
$ tar -cf file.tar file # pack (archive without compression)
$ tar -xf file.tar # unpack
gz
$ gzip -9 file
$ gunzip file.gz
tar.gz
$ tar -zcf file.tar.gz file
$ tar -zxf file.tar.gz
bz2
$ bzip2 file
$ bunzip2 file.bz2
tar.bz2
$ tar -jcf file.tar.bz2 file
$ tar -jxf file.tar.bz2
After the demonstration above, the roles of the `tar`, `bzip2`, `bunzip2`, `gzip`, and `gunzip` commands should be clear. If not, practice and compare the commands above, and consult their manuals: `man tar` ...
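A round trip makes the division of labor visible: `tar` collects files into one archive, and the `z` option adds gzip compression on top. The helper name `pack_and_list` and the directory names below are illustrative.

```shell
# pack_and_list DIR ARCHIVE: create a gzip-compressed tar of DIR, then list its members.
pack_and_list() {
    # -C changes into the parent directory first, so member names stay relative.
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
    tar -tzf "$2"
}

d=$(mktemp -d)
mkdir "$d/project"
echo "hello" > "$d/project/file"
pack_and_list "$d/project" "$d/project.tar.gz"   # lists project/ and project/file
mkdir "$d/out"
tar -xzf "$d/project.tar.gz" -C "$d/out"         # unpack into another directory
cat "$d/out/project/file"                        # hello
rm -r "$d"
```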
File search means finding, within a directory hierarchy, the location in the file system of files with certain properties. Extended to the whole network, such a location can be expressed as a `URL`; a local address can be expressed as `file://` + the local path. Under Linux a local path starts with `/`, so, for example, users' home directories can be found under `file:///home/`. Only some methods for searching local files are introduced here.
The `find` command provides a "real-time" search: at the user's request it traverses all files under the specified directory hierarchy until the required files are found. `updatedb+locate` provides a "fast" search: `updatedb` traverses the file system and generates a local database, and `locate` then searches that database by file name to find files quickly. The former supports searching by a variety of file properties and provides an interface (the `-exec` option) for processing the files it finds, which is extremely convenient for enthusiasts of "one-liner" scripts; for searches based on file names, however, `updatedb+locate` can be significantly more efficient.
Here is a quick look at both approaches:
Basic usage of the `find` command
$ find ./ -name "*.c" -o -name "*.h" # find all C language files; -o means "or"
$ find ./ \( -name "*.c" -o -name "*.h" \) -exec mv '{}' ./c_files/ \;
# move the files found into c_files; this usage is very interesting
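The parentheses in the second command matter. In `find`, `-o` binds more loosely than the implicit "and" between a test and the action that follows it, so an explicit action written without grouping applies only to the last test. The sketch below shows this with `-print` in a scratch directory; the variable names are illustrative.

```shell
d=$(mktemp -d)
touch "$d/a.c" "$d/b.h" "$d/c.txt"

# Without grouping, -print binds only to the second -name test,
# so a.c matches but is never printed.
ungrouped=$(cd "$d" && find . -name "*.c" -o -name "*.h" -print | sort)

# With \( \), the action applies to files matching either test.
grouped=$(cd "$d" && find . \( -name "*.c" -o -name "*.h" \) -print | sort)

echo "ungrouped: $ungrouped"        # only ./b.h
echo "grouped:"; echo "$grouped"    # ./a.c and ./b.h
rm -r "$d"
```

The first command in the text needs no parentheses because it has no explicit action: the default `-print` is applied to the whole expression.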
The usage above can also be implemented with the `xargs` command
$ find ./ -name "*.c" -o -name "*.h" | xargs -i mv '{}' ./c_files/
# For more complex processing, replace mv with a command of your own; for example, to change all file name suffixes to upper case:
$ find ./ -name "*.c" -o -name "*.h" | xargs -i ./toupper.sh '{}' ./c_files/
`toupper.sh` is the processing script we need to implement; it converts lower case to upper case, as follows:
$ cat toupper.sh
#!/bin/bash
# {} is expanded to the current line and becomes the first argument of this script;
# the second argument is the destination directory
FROM=$1
BASENAME=${FROM##*/}
BASE=${BASENAME%.*}
SUFFIX=${BASENAME##*.}
TOSUFFIX="$(echo "$SUFFIX" | tr '[:lower:]' '[:upper:]')"
TO=$2/$BASE.$TOSUFFIX
# show the command, then run it with quoting so names containing spaces survive
echo "mv $FROM $TO"
mv "$FROM" "$TO"
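One edge case is worth noting: for a name without a dot, such as `Makefile`, both `${BASENAME%.*}` and `${BASENAME##*.}` return the whole name, so the script above would produce `Makefile.MAKEFILE`. A hedged variant of the renaming logic as a function (the name `upper_suffix` is made up) could guard against that:

```shell
# upper_suffix PATH: print the base name of PATH with its suffix in upper case.
# Names without a dot are printed unchanged.
upper_suffix() {
    name=${1##*/}                 # strip the directory part
    case $name in
        *.*) base=${name%.*}      # everything before the last dot
             suffix=$(printf '%s' "${name##*.}" | tr '[:lower:]' '[:upper:]')
             printf '%s.%s\n' "$base" "$suffix" ;;
        *)   printf '%s\n' "$name" ;;
    esac
}

upper_suffix ./src/main.c        # main.C
upper_suffix "my file.tar.gz"    # my file.tar.GZ
upper_suffix Makefile            # Makefile
```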
Basic usage of `updatedb+locate`
$ updatedb # update the database
$ locate find*.gz # find all gz archives whose names contain the string find
In fact, besides the two commands above, Linux also has finders for commands: `which` returns the full path of a command, and `whereis` returns the paths of a command, its source files, and its man pages. For example, to find the absolute path of the `find` command:
$ which find
/usr/bin/find
$ whereis find
find: /usr/bin/find /usr/X11R6/bin/find /usr/bin/X11/find /usr/X11/bin/find /usr/man/man1/find.1.gz /usr/share/man/man1/find.1.gz /usr/X11/man/man1/find.1.gz
It should be mentioned that if you want to search for files by their content, `find`, `updatedb+locate`, `which`, and `whereis` cannot help. The alternatives are commands such as `grep` and `sed`: with the `-r` option, the former searches the files under a given directory for the specified content, and with the `-i` option, the latter replaces content in files in place. Their basic usage has been detailed in the previous sections and will not be repeated here.
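As a brief reminder of how the two fit together, the sketch below searches a scratch directory by content and then replaces the match in place. It assumes GNU `grep` and GNU `sed`; the helper name `files_containing` is illustrative.

```shell
# files_containing PATTERN DIR: list files under DIR whose content matches PATTERN.
files_containing() {
    grep -rl -- "$1" "$2" | sort
}

d=$(mktemp -d)
echo "hello, world" > "$d/a.txt"
echo "nothing here" > "$d/b.txt"

files_containing "hello" "$d"                 # prints the path of a.txt only
sed -i 's/hello/goodbye/' "$d/a.txt"          # in-place replacement
cat "$d/a.txt"                                # goodbye, world
rm -r "$d"
```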
These commands matter a great deal for file operations. To some extent they abstract away the structure of the file system, reducing operations on the whole file system to operations on single files; and if only the text content is considered, operations on a single file in turn reduce to the string operations discussed in the previous section. To get a clearer picture of the structure of files and the relationships between them, the file system is explored in depth in the next section.
Loading, parsing, and instance analysis of dynamic links to ELF files in Linux under the Intel platform: