KnowHow


[Thread version: revision 4] Linux server backup program using Python and bash

Registered: 2025/07/13 20:50
Category: Linux

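This is an improved version of the backup program. The source directory and the backup directory are now specified in settings.py.
The script path is also obtained automatically with os and sys, which makes initial setup and maintenance easier.
The bash backup script uses rsync. Since there seemed to be no need to separate full and differential backups, the command was simplified.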

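When the home directory holds many users, the backup takes a long time.

The backup server and the main server are connected via InfiniBand, so the data transfer rate is high and the bottleneck is I/O (the job is I/O-bound). To use the available bandwidth as efficiently as possible, it is therefore better to parallelize the backup with threads or similar.

Because home directories are added and removed over time, the program uses Python to collect the directories under the home area automatically and runs the backup in threads.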
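
As a side note, the directory discovery described above can also be done without calling a shell command. The following is a minimal sketch, assuming os.scandir is acceptable in place of the `ls -a` call used later in backup_thread.py; the function name list_home_dirs is hypothetical and not part of the original program.

```python
import os
import tempfile


def list_home_dirs(home):
    """Return the sorted paths of the directories directly under `home`."""
    with os.scandir(home) as entries:
        return sorted(
            entry.path
            for entry in entries
            # Only real directories are kept; plain files and symlinks are skipped.
            if entry.is_dir(follow_symlinks=False)
        )


if __name__ == "__main__":
    # Demonstrate on a temporary directory instead of the real /home.
    with tempfile.TemporaryDirectory() as root:
        for name in ("N1001", "N1002"):
            os.mkdir(os.path.join(root, name))
        open(os.path.join(root, "note.txt"), "w").close()
        print([os.path.basename(p) for p in list_home_dirs(root)])
```

os.scandir never returns the '.' and '..' entries, so no special handling is needed for them.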


Folder layout

-rwxr-xr-x. 1 root root 8523  7月 13 17:40 backup_thread.py
drwxr-xr-x. 4 root root   79  7月 13 18:10 config
drwxr-xr-x. 3 root root   45  6月 22 20:02 log
drwxr-xr-x. 4 root root   72  7月 13 18:02 script

Backup script (shell)

script/backup.sh

#!/bin/bash

usage() {
    echo "Usage: $0 <source_directory> <backup_directory>"
    echo "Example: $0 /home/user /backup_dir"
}

backup() {
    local SOURCE_DIR="$1"
    local BACKUP_BASE="$2"
    # basename /home/user -> user
    # local BACKUP_DIR="$BACKUP_BASE/$(basename "$SOURCE_DIR")"
    local BACKUP_DIR="$BACKUP_BASE$SOURCE_DIR"

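    # Create the backup destination directory
    mkdir -p "$BACKUP_DIR" || return 1

    # Back up with rsync (the destination always mirrors the latest state).
    # On failure, return rsync's exit status so the caller can detect it.
    rsync -az --delete \
        --exclude='.cache' \
        --exclude='*.tmp' \
        --exclude='*.log' \
        "$SOURCE_DIR/" "$BACKUP_DIR/" || return $?

    echo "Backup completed: $BACKUP_DIR"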
}

main() {
    if [[ $# -ne 2 ]]; then
        usage
        exit 1
    fi

    backup "$1" "$2"
}

main "$@"

The script can be run as follows:

# ./backup.sh /home/N1001 /backup_dir
Backup completed: /backup_dir/home/N1001
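
The destination shown above comes from the plain concatenation of the backup base and the source path in backup.sh. The sketch below reproduces that rule in Python to make the mapping explicit. The function names backup_destination and rsync_command are hypothetical; the rsync options are the ones used in backup.sh.

```python
def backup_destination(source_dir, backup_base):
    # Same rule as BACKUP_DIR="$BACKUP_BASE$SOURCE_DIR" in backup.sh:
    # plain concatenation, so source_dir is expected to be an absolute path.
    return backup_base + source_dir


def rsync_command(source_dir, backup_base):
    # Build the argument list that corresponds to the rsync call in backup.sh.
    dest = backup_destination(source_dir, backup_base)
    return [
        "rsync", "-az", "--delete",
        "--exclude=.cache", "--exclude=*.tmp", "--exclude=*.log",
        source_dir + "/", dest + "/",
    ]


if __name__ == "__main__":
    print(backup_destination("/home/N1001", "/backup_dir"))
    # prints /backup_dir/home/N1001
    print(rsync_command("/home/N1001", "/backup_dir"))
```

Keeping the full source path under the backup base means that directories with the same name from different source trees do not collide.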

Settings file for running with threads (settings.py)

config/settings.py

"""
created by N.Tagawa
version 2025.07.13
Please Change Option
 number of threading -> integer
 Timeout             -> integer
"""

# Directory Settings
BASE_DIR       = "/home/APPLI/TOOLS/backup_script/"
LOG_FILE       = BASE_DIR + "log/check_result.log"
BACKUP_SCRIPT  = BASE_DIR + "script/backup.sh"
USERLIST       = BASE_DIR + "config/userlist.txt"
BACKUP_DIR     = "/backup_dir"

# Threading settings
THREADING_NUM  = 4
PROCESSES_NUM  = 4
# 4days = 345600sec
TIMEOUT        = 345600


# Set to True to read the home directories from userlist.txt
# instead of collecting them with FetchHomeDir().
USERLIST_FLG   = False
HOME_DIR       = "/home"


# Debug and Test Code Use Flg
DEBUG          = False
TEST_CODE      = False
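
The TIMEOUT value and the derived paths can be cross-checked with a few lines of Python. This is only a sketch: it uses os.path.join and datetime.timedelta, which the original settings.py does not use, and it assumes a POSIX path separator.

```python
import os
from datetime import timedelta

BASE_DIR = "/home/APPLI/TOOLS/backup_script/"

# os.path.join copes with the trailing slash in BASE_DIR.
LOG_FILE = os.path.join(BASE_DIR, "log", "check_result.log")
BACKUP_SCRIPT = os.path.join(BASE_DIR, "script", "backup.sh")

# 4 days expressed in seconds: 4 * 24 * 60 * 60
TIMEOUT = int(timedelta(days=4).total_seconds())

if __name__ == "__main__":
    print(LOG_FILE)
    print(BACKUP_SCRIPT)
    print(TIMEOUT)  # prints 345600
```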

Python code that runs the backup in threads (backup_thread.py)

backup_thread.py

#!/usr/bin/python3

from abc import ABC, abstractmethod
import subprocess
from subprocess import PIPE
import queue
import threading
import logging
import time
import datetime
import signal
import os
import sys
import gc
import socket

# Directory that contains this script (and therefore the config package)
dir_path = os.path.dirname(os.path.abspath(__file__))
sys.path.append(dir_path)

from config import settings

"""
created by Tagawa.
Backup home/*
Settings by config/settings.py
method Thread
version 2025.06.22
"""

logging.basicConfig(
        filename=settings.LOG_FILE,
        level=logging.INFO,
        format='%(asctime)s:%(name)s:%(levelname)s:%(threadName)s:%(message)s')
logger = logging.getLogger(__name__)

logger.debug({'add path': dir_path})


class ShellCommand(object):
    def __init__(self, dt_now, timeout: int, command: str):
        self.stdout = False
        self.stderr = False
        self.returncode = False
        self.command = False
        self._timeout = timeout
        self._command = command
        self._dt_now = dt_now
        self._command_result = False
        self._errlog = False

    def submit_command(self, command):
        self.command = command
        result = subprocess.run(
                self.command,
                shell=True,
                stdout=PIPE,
                stderr=PIPE,
                timeout=self._timeout)
        self.stdout = result.stdout.decode('utf-8')
        self.stderr = result.stderr.decode('utf-8')
        self.returncode = result.returncode

        if result.returncode != 0:
            raise Exception(self.stderr)

    def execute_command(self):
        try:
            self.submit_command(self._command)
            self._command_result = self.stdout
        except Exception as e:
            self._command_result = self.stderr
            self._errlog = str(e)
            logger.error({
                'time': self._dt_now,
                'status': 'failed',
                'action': 'ExecuteShellCommand',
                'error': self._errlog,
                'command': self._command})


class FetchHomeDir(object):
    def __init__(self, dt_now, timeout, home, debug_flg):
        self._dt_now = dt_now
        self._timeout = timeout
        self._home = home
        self._status = None
        self._command = 'ls -a ' + home
        self.shell = ShellCommand(dt_now, timeout, self._command)
        self.homedirs = []
        self.debug_flg = debug_flg

    def run_command(self):
        self.shell.execute_command()
        logger.debug({'return command result': self.shell._command_result})
        if not self.shell._errlog and self.shell._command_result != "":
            self._status = 'success'
            homedirs = self.shell._command_result.split('\n')
            for _home in homedirs[2:]:
                # skip '.', '..'
                if _home != "":
                    _path = self._home + '/' + _home
                    logger.debug(_path)
                    self.homedirs.append(_path)
        else:
            self._status = 'failed'
            logger.error({
                'time': self._dt_now,
                'status': self._status,
                'action': 'FetchHomeDir',
                'command': self._command,
                'home': self._home})

        if self.debug_flg:
            print('FetchHomeDir: debug print')
            print(f'{self._status}: {__file__} FetchHomeDir from {self._home}')
            for _home in self.homedirs:
                print(_home)
            print('')


class IThreadWorker(ABC):
    def __init__(self, dt_now, queue, num_of_thread, timeout):
        self.dt_now = dt_now
        self.queue = queue
        self.num_of_thread = num_of_thread
        self.timeout = timeout
        self.command = None

    def run(self):
        ts = []
        for _ in range(self.num_of_thread):
            t = threading.Thread(target=self.worker)
            t.start()
            ts.append(t)
        # Queue one sentinel (None) per worker, then wait for all workers
        for _ in ts:
            self.queue.put(None)
        for t in ts:
            t.join()

    @abstractmethod
    def worker(self):
        logging.debug('start')
        while True:
            item = self.queue.get()
            if item is None:
                break
            print({'thread': item})
            self.some_process()
            self.queue.task_done()
        logging.debug('end')

    def some_process(self):
        pass


class ThreadBackup(IThreadWorker):
    def __init__(
            self, dt_now, queue, num_of_thread, timeout, backup_script, backup_dir, debug_flg):
        super().__init__(dt_now, queue, num_of_thread, timeout)
        #test-------------------------------
        #self.command = 'ls -l '
        #self.command = 'sleep 3 || ls -l '
        #-----------------------------------
        self.command = backup_script + ' '
        self.backup_dir = backup_dir
        self.debug_flg = debug_flg
        self.result = []

    def worker(self):
        logging.debug('start')
        while True:
            path_dir = self.queue.get()
            if path_dir is None:
                break
            if self.debug_flg:
                print(f"{threading.current_thread().name}: {path_dir}")
                self.check_home_dir(path_dir)
            else:
                self.start_backup(path_dir, self.backup_dir)
            self.queue.task_done()
        logging.debug('end')

    def check_home_dir(self, path):
        try:
            _command = 'ls -a ' + path
            _shell = ShellCommand(self.dt_now, self.timeout, _command)
            _shell.execute_command()
            print(_shell._command_result.split('\n'))

        except Exception as e:
            print({'command Error': str(e)})

    def start_backup(self, path, backup_dir):
        try:
            _command = self.command + path + ' ' + backup_dir
            _shell = ShellCommand(self.dt_now, self.timeout, _command)
            _shell.execute_command()
            # execute_command() catches and logs failures itself and leaves
            # _errlog set, so do not report success in that case.
            if _shell._errlog is not False:
                return
            logger.info({
                'status': 'success',
                'source': path,
                'result': _shell._command_result.split('\n')[0],
                })

        except Exception as e:
            print({'command Error': str(e)})
            logger.error({
                'time': self.dt_now,
                'status': 'failed',
                'action': 'ThreadBackup',
                'message': str(e),
                'path': path})


"""
Read the user list (home directory list) from a text file.
"""
def read_userlist(filename):
    with open(filename, 'r', encoding='utf-8') as f:
        lines = [line.strip() for line in f if line.strip()]
    return lines


"""
test code
"""
def test_fetch_home_dir():

    timeout = settings.TIMEOUT
    dt_now = datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S')

    home = '/home'
    # FetchHomeDir requires debug_flg as its fourth argument
    check_home_dir = FetchHomeDir(dt_now, timeout, home, settings.DEBUG)
    check_home_dir.run_command()

    print({'result': check_home_dir.homedirs})


"""
main code
"""
def main_thread():

    # import settings
    timeout = settings.TIMEOUT
    threads = settings.THREADING_NUM

    backup_script = settings.BACKUP_SCRIPT
    backup_dir = settings.BACKUP_DIR

    debug_flg = settings.DEBUG
    test_flg = settings.TEST_CODE

    home = settings.HOME_DIR
    filename = settings.USERLIST
    userlist_flg = settings.USERLIST_FLG

    # test code
    if test_flg:
        test_fetch_home_dir()
        print('--- test end ---')
        sys.exit(0)

    # initial set
    dt_now = datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S')
    homedirs_queue = queue.Queue()
    start = time.time()

    # start main
    logger.info({'Start Backup': dt_now})

    # set queue
    if userlist_flg:
        homedirlist = read_userlist(filename)
    else:
        fetch_home_dir = FetchHomeDir(dt_now, timeout, home, debug_flg)
        fetch_home_dir.run_command()
        homedirlist = fetch_home_dir.homedirs
        del fetch_home_dir

    print('put:homedir_queue')
    for homedir in homedirlist:
        print(homedir)
        homedirs_queue.put(homedir)

    # backup start
    thread_home_backup = ThreadBackup(
            dt_now, homedirs_queue, threads, timeout, backup_script, backup_dir, debug_flg)
    thread_home_backup.run()

    end = time.time()
    logger.info({
        'action': 'Backup by threads',
        'elapsed time': '{: 4f} sec'.format(end - start)})


    print('thread time: {: 4f}\n'.format(end - start))

    del thread_home_backup,homedirs_queue, homedirlist
    gc.collect()


if __name__ == '__main__':

    main_thread()
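
The worker loops in IThreadWorker and ThreadBackup rely on a queue with one None sentinel per thread to stop the workers. The sketch below isolates that pattern so it can be tried on its own. The function run_workers and the job callable are hypothetical stand-ins; in the real program the job is the call to backup.sh.

```python
import queue
import threading


def run_workers(items, job, num_threads):
    """Apply `job` to every item using `num_threads` worker threads."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this thread
                break
            outcome = job(item)
            with lock:  # protect the shared result list
                results.append(outcome)

    for item in items:
        tasks.put(item)
    for _ in range(num_threads):
        tasks.put(None)  # one sentinel per worker, queued behind the real items

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results  # completion order, not input order


if __name__ == "__main__":
    done = run_workers(["/home/N1001", "/home/N1002", "/home/N1003"], str.upper, 2)
    print(sorted(done))
```

Because the queue is first in, first out, each sentinel is only reached after all real items have been taken, so every worker exits once the work is done.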

Example run (backup_thread.py)

Example run of the thread-based backup script

[root@ManageServer backup_script]# ls /backup_dir/
[root@ManageServer backup_script]#

[root@ManageServer backup_script]# ./backup_thread.py
put:homedir_queue
/home/APPLI
/home/N1001
/home/N1002
/home/N1003
/home/N1004
/home/N1005
/home/N1006
/home/N1007
/home/N1008
/home/N1009
/home/N1010
/home/download
/home/install
/home/settings
/home/user01
thread time:  10.465218

[root@ManageServer backup_script]#
[root@ManageServer backup_script]# ls /backup_dir/
home
[root@ManageServer backup_script]#
[root@ManageServer backup_script]# ll /backup_dir/home
合計 0
drwxr-xr-x. 3 root   root   19  6  8 16:57 APPLI
drwx------. 3 N1001  N1001  94  7 13 17:52 N1001
drwx------. 3 N1002  N1002  78  6  8 19:09 N1002
drwx------. 3 N1003  N1003  78  6  8 19:09 N1003
drwx------. 3 N1004  N1004  78  6  8 19:09 N1004
drwx------. 3 N1005  N1005  78  6  8 19:09 N1005
drwx------. 3 N1006  N1006  78  6  8 19:09 N1006
drwx------. 3 N1007  N1007  78  6  8 19:09 N1007
drwx------. 3 N1008  N1008  78  6  8 19:09 N1008
drwx------. 3 N1009  N1009  78  6  8 19:09 N1009
drwx------. 3 N1010  N1010  78  6  8 19:09 N1010
drwxr-xr-x. 5 root   root   47  5  4 13:24 download
drwxrwxrwx. 6 root   root   66  6  8 16:57 install
drwxr-xr-x. 4 root   root   86  6  8 19:07 settings
drwx------. 3 user01 user01 78  6  7 21:57 user01
[root@ManageServer backup_script]#

The results of the run are saved to the log file as follows:
log/check_result.log

[root@ManageServer backup_script]# cat log/check_result.log
2025-07-13 21:03:33,838:__main__:INFO:MainThread:{'Start Backup': '2025/07/13 21:03:33'}
2025-07-13 21:03:33,872:__main__:INFO:Thread-4:{'status': 'success', 'source': '/home/N1003', 'result': 'Backup completed: /backup_dir/home/N1003'}
2025-07-13 21:03:33,874:__main__:INFO:Thread-2:{'status': 'success', 'source': '/home/N1001', 'result': 'Backup completed: /backup_dir/home/N1001'}
2025-07-13 21:03:33,890:__main__:INFO:Thread-4:{'status': 'success', 'source': '/home/N1004', 'result': 'Backup completed: /backup_dir/home/N1004'}
2025-07-13 21:03:33,902:__main__:INFO:Thread-4:{'status': 'success', 'source': '/home/N1006', 'result': 'Backup completed: /backup_dir/home/N1006'}
2025-07-13 21:03:33,913:__main__:INFO:Thread-3:{'status': 'success', 'source': '/home/N1002', 'result': 'Backup completed: /backup_dir/home/N1002'}
2025-07-13 21:03:33,928:__main__:INFO:Thread-2:{'status': 'success', 'source': '/home/N1005', 'result': 'Backup completed: /backup_dir/home/N1005'}
2025-07-13 21:03:33,941:__main__:INFO:Thread-2:{'status': 'success', 'source': '/home/N1009', 'result': 'Backup completed: /backup_dir/home/N1009'}
2025-07-13 21:03:33,954:__main__:INFO:Thread-4:{'status': 'success', 'source': '/home/N1007', 'result': 'Backup completed: /backup_dir/home/N1007'}
2025-07-13 21:03:33,957:__main__:INFO:Thread-2:{'status': 'success', 'source': '/home/N1010', 'result': 'Backup completed: /backup_dir/home/N1010'}
2025-07-13 21:03:33,966:__main__:INFO:Thread-3:{'status': 'success', 'source': '/home/N1008', 'result': 'Backup completed: /backup_dir/home/N1008'}
2025-07-13 21:03:34,079:__main__:INFO:Thread-3:{'status': 'success', 'source': '/home/settings', 'result': 'Backup completed: /backup_dir/home/settings'}
2025-07-13 21:03:34,171:__main__:INFO:Thread-3:{'status': 'success', 'source': '/home/user01', 'result': 'Backup completed: /backup_dir/home/user01'}
2025-07-13 21:03:34,585:__main__:INFO:Thread-4:{'status': 'success', 'source': '/home/download', 'result': 'Backup completed: /backup_dir/home/download'}
2025-07-13 21:03:34,783:__main__:INFO:Thread-1:{'status': 'success', 'source': '/home/APPLI', 'result': 'Backup completed: /backup_dir/home/APPLI'}
2025-07-13 21:03:44,302:__main__:INFO:Thread-2:{'status': 'success', 'source': '/home/install', 'result': 'Backup completed: /backup_dir/home/install'}
2025-07-13 21:03:44,303:__main__:INFO:MainThread:{'action': 'Backup by threads', 'elapsed time': ' 10.465218 sec'}
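
Since every log message is written as a Python dict, the log can be summarized with a short script. The following is a sketch that assumes the log format configured in logging.basicConfig above; the helper names parse_line and count_by_status are hypothetical.

```python
import ast
import re
from collections import Counter

# asctime:name:levelname:threadName:message, as set in logging.basicConfig
LINE_RE = re.compile(
    r"^(?P<asctime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}):"
    r"(?P<name>[^:]+):(?P<level>[^:]+):(?P<thread>[^:]+):(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields; return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    try:
        # The message is the repr of a dict, so literal_eval can rebuild it.
        record["message"] = ast.literal_eval(record["message"])
    except (ValueError, SyntaxError):
        pass  # keep the raw string when it is not a literal
    return record


def count_by_status(lines):
    """Count log records by the 'status' key of their message dict."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and isinstance(record["message"], dict):
            status = record["message"].get("status")
            if status:
                counts[status] += 1
    return counts


if __name__ == "__main__":
    sample = ("2025-07-13 21:03:33,872:__main__:INFO:Thread-4:"
              "{'status': 'success', 'source': '/home/N1003', "
              "'result': 'Backup completed: /backup_dir/home/N1003'}")
    print(parse_line(sample)["thread"])
    print(count_by_status([sample]))
```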