Qiang Blog

Just another zhangjingqiang's blog.

使用Google API自动把本地文件内容写入Google Spreadsheet

需要使用的Google API

  • Google Drive API
  • Google Spreadsheet API

实现的功能

  • 获取Google认证权限
  • 在Google Drive创建Google Spreadsheet
  • 分享给用户,域名功能
  • 删除原有sheet1并创建新的固定sheet
  • 向Google Spreadsheet循环创建并写入本地指定路径下所有文件内容
  • 设置sheet头部样式(标题,颜色,固定,加粗等)

此例文件类型

  • tsv

使用方法

Step 1

从Google开发者页面下载 client_secret.json
参考:

https://developers.google.com/sheets/api/quickstart/python

Step 2

在terminal运行:

$ python write_to_google_sheets.py <file_name> <folder_path>

精彩代码

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
import httplib2
import os
import sys
import json
import re
from termcolor import colored

from apiclient import discovery
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage
import requests

reload(sys)
sys.setdefaultencoding('utf8')

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/sheets.googleapis.com-python-quickstart.json
SCOPES = "https://www.googleapis.com/auth/drive https://www.googleapis.com/auth/spreadsheets"
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google API Drive + Spreadsheet'

def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'sheets.googleapis.com-python-quickstart.json')

    store = Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else: # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials

def main():
    """Create spreadsheet and write data into it
    Use:
    - Google Drive API
    - Google Spreadsheet API
    """
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())

    # Create new spreadsheet
    spreadsheet_id = create_spreadsheet(http)

    # Create new sheet
    write_data_to_sheets(http, spreadsheet_id)

    print(colored('Finish!', 'green'))

def create_spreadsheet(http):
    drive_service = discovery.build('drive', 'v3', http=http)
    file_metadata = {
      'name' : sys.argv[1],
      'mimeType' : 'application/vnd.google-apps.spreadsheet'
    }
    file = drive_service.files().create(body=file_metadata, fields='id').execute()
    # share spreadsheet
    share(drive_service, file.get('id'))
    return file.get('id')

def write_data_to_sheets(http, spreadsheet_id):
    discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?'
                    'version=v4')
    spreadsheet_service = discovery.build('sheets', 'v4', http=http,
                              discoveryServiceUrl=discoveryUrl)

    files = os.listdir(sys.argv[2])
    print(colored(files, 'red'))

    # create new sheets and delete default sheet1
    spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=base_sheet()).execute()
    # write log files to sheets
    for file in files:
        write_log_to_sheet(spreadsheet_service, spreadsheet_id, file)

def base_sheet():
    data = {
      'requests': [
        {
          'addSheet':{
            'properties': {'title': u'New Sheet Name'}
          }
        },
        {
          'deleteSheet':{
            'sheetId': 0
          }
        }
      ]
    }
    return data

def write_log_to_sheet(spreadsheet_service, spreadsheet_id, file):
    # make sheet
    body = {
      'requests': [
        {
          'addSheet':{
            'properties': {'title': u'{0}'.format(file)}
          }
        }
      ]
    }
    result = spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()
    sheetId = result['replies'][0]['addSheet']['properties']['sheetId']

    # write to sheet
    values = []
    values.append(titles(file))
    lines = [line.rstrip() for line in open(file)]
    for i in range(len(lines)):
        values.append(re.split(r'\t+', lines[i]))
    data = [
            {
                'range': '{0}!A1'.format(file),
                'values': values
            }
    ]
    body = {
            'valueInputOption': 'USER_ENTERED',
            'data': data
    }
    spreadsheet_service.spreadsheets().values().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()

    # format header row
    body = {
    "requests": [
    {
      "repeatCell": {
        "range": {
          "sheetId": sheetId,
          "startRowIndex": 0,
          "endRowIndex": 1
        },
        "cell": {
          "userEnteredFormat": {
            "backgroundColor": {
              "red": 0.0,
              "green": 0.0,
              "blue": 1.0
            },
            "horizontalAlignment" : "LEFT",
            "textFormat": {
              "foregroundColor": {
                "red": 1.0,
                "green": 1.0,
                "blue": 1.0
              },
              "fontSize": 12,
              "bold": 'true'
            }
          }
        },
        "fields": "userEnteredFormat(backgroundColor,textFormat,horizontalAlignment)"
      }
    },
    {
      "updateSheetProperties": {
        "properties": {
          "sheetId": sheetId,
          "gridProperties": {
            "frozenRowCount": 1
          }
        },
        "fields": "gridProperties.frozenRowCount"
      }
    }
    ]
    }
    spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()

def titles(file):
    values_title = [
        'ID', 'Name', 'Description'
    ]
    return values_title

def share(drive_service, spreadsheet_id):
    batch = drive_service.new_batch_http_request(callback=callback)
    # share to users
    user_permission = [
        'user1@example.com',
        'user2@example.com'
    ]
    for user in user_permission:
        share_user(drive_service, spreadsheet_id, batch, user)
    # share to domains
    domain_permission = [
        'google.com',
        'facebook.com'
    ]
    for domain in domain_permission:
        share_domain(drive_service, spreadsheet_id, batch, domain)
    # batch execute
    batch.execute()

def share_user(drive_service, spreadsheet_id, batch, user):
    user_permission = {
            'type': 'user',
            'role': 'writer',
            'emailAddress': user
            }
    batch.add(drive_service.permissions().create(
        fileId=spreadsheet_id,
        body=user_permission,
        fields='id',
        ))
    return batch

def share_domain(drive_service, spreadsheet_id, batch, domain):
    domain_permission = {
            'type': 'domain',
            'role': 'reader',
            'domain': domain
            }
    batch.add(drive_service.permissions().create(
        fileId=spreadsheet_id,
        body=domain_permission,
        fields='id',
        ))
    return batch

def callback(request_id, response, exception):
    if exception:
        print(exception)
    else:
        print("Permission Id: {0}".format(response.get('id')))

if __name__ == '__main__':
    if len(sys.argv) == 3 and os.path.isdir(sys.argv[2]):
        main()
    else:
        print('Please input a file name and log path.')

google-drive-api google-spreadsheet-api python

How to bulk compare images by imagemagick with python and ruby?

If we want to compare all images in two different path, we can save them into two files and bulk compare them by batch script.

For example:

# old
1.png
2.png
# new
1.png
2.png( - Different image)

Then run the python or ruby script:

#!/usr/bin/python
# coding: UTF-8

import os
import sys
import subprocess
from termcolor import colored

reload(sys)
sys.setdefaultencoding('utf8')

def main():
    if len(sys.argv) == 3 and os.path.isfile(sys.argv[1]) and os.path.isfile(sys.argv[2]):
        compare()
    else:
        print 'Please input regular files'

def compare():
    list1 = [line.rstrip() for line in open(sys.argv[1])]
    list2 = [line.rstrip() for line in open(sys.argv[2])]
    for i in range(len(list1)):
        os.system("composite -compose difference {0} {1} {2}".format(list1[i], list2[i], '/tmp/diff.png'))
        pipe = subprocess.Popen("identify -format %[mean] {0}".format('/tmp/diff.png'), shell=True, stdout=subprocess.PIPE).stdout
        value = pipe.read()
        if float(value) > 0:
            os.system("cp /tmp/diff.png {0}".format(str(i) + '.png'))
            print colored("{0}{1} Diff - {2}".format('[' + str(i) + ']', float(value), list2[i]), 'red')
        else:
            print colored("{0}{1} Same - {2}".format('[' + str(i) + ']', value, list2[i]), 'green')

if __name__ == "__main__":
    print 'Compare by python:'
    main()
#!/usr/bin/ruby

require 'colorize'

class CompareImages
  def initialize()
    puts "Compare by ruby:"
  end

  def read_put
    list1 = File.readlines(ARGV[0]).map{|x| x.strip}
    list2 = File.readlines(ARGV[1]).map{|x| x.strip}
    (0..list1.length - 1).each do |i|
      `composite -compose difference #{list1[i]} #{list2[i]} /tmp/diff.png`
      value = `identify -format %[mean] /tmp/diff.png`
      if value.to_i > 0
        `cp /tmp/diff.png #{i}.png`
        puts "[#{i}]#{value} Diff - #{list2[i]}".red
      else
        puts "[#{i}]#{value} Same - #{list2[i]}".green
      end
    end
  end
end

ci = CompareImages.new
ci.read_put
$ python ci.py old new
$ ruby ci.rb old new

It can also output the messages with color in the terminal!

imagemagick python ruby

用Python脚本导入文件中数据到MySQL数据库

#!/usr/bin/python
# coding: UTF-8

import sys
import os

reload(sys)
sys.setdefaultencoding('utf8')

def main():
    if len(sys.argv) == 2 and os.path.isdir(sys.argv[1]):
        make_file()
    else:
        print 'Please input a regular path'

def make_file():
    files = os.listdir(sys.argv[1])
    print files
    for file in files:
        if '.tsv' in file and '.swp' not in file:
            os.system("iconv -f SHIFT-JIS -t UTF-8 {0} > {1}".format(sys.argv[1] + "/" + file, 'tmp'))
            file_name = ''
            # Error Report
            if 'errors_report' in file:
                if 'English Error' in open('tmp').read():
                    file_name = 'errors_report_en'
                    os.system("mv tmp {0}".format(file_name))
                elif 'Japanese Error' in open('tmp').read():
                    file_name = 'errors_report_ja'
                    os.system("mv tmp {0}".format(file_name))
            # Warning Report
            if 'warnings_report' in file:
                file_name = 'warnings_report'
                os.system("mv tmp {0}".format(file_name))

            _import(file_name)
            _remove_tmp(file_name)

def _import(file_name):
    """
    Import to MySQL
    """
    os.system("mysqlimport -u root --local caption {0}".format(file_name))

def _remove_tmp(file_name):
    """
    Remove renamed file
    """
    os.system("rm {0}".format(file_name))

if __name__ == "__main__":
    main()

这个脚本可以把一个目录中的所有文件读入数据库,并且根据文件中数据不同,读入不同的表。例如数据库中有这样的表:

  • errorsreporten
  • errorsreportja
  • warnings_report

表中不用ID,完全和文件中的数据格式一致。对所有字段加uniq索引,确保读入的数据不重复插入。
此例目标文件为 .tsv 文件,编码不是UTF-8,要先转码。

脚本使用方法:

python import_to_db.py <path>

mysql python

在GAE上开发Python程序

创建APP

https://appengine.google.com/

安装SDK

$ wget http://googleappengine.googlecode.com/files/google_appengine_1.7.1.zip
$ unzip aoogle_appengine_1.7.1.zip

官方下载主页

https://developers.google.com/appengine/downloads

创建简单的应用

$ mkdir -p ~/workspace/gae_helloworld
$ mv google_appengine ~/workspace/gae_helloword/
$ cd ~/workspace/gae_helloworld

创建 app.yaml 文件(gae_helloworld的根目录下)

    application: helloworld
    version: 1
    runtime: python27
    api_version: 1
    threadsafe: true

    handlers:
    - url: /.*
      script: helloworld.application

创建 helloworld.py 文件(同样在根目录下)

def application(environ, start_response):
    """包含 environ, start_response 的函数
    """
    start_response('200 OK',[('Content-Type', 'text/plain')])
    return 'Hello, world!'

确认运行效果

$ google_appengine/dev_appserver.py .

# 指定端口
$ google_appengine/dev_appserver.py . --port=5000

发布

$ google_appengine/appcfg.py update .

访问 http://helloworld.appspot.com/ 确认运行效果。

参考资料

Pythonプロフェッショナルプログラミング第12章

google-app-engine python

在全新Ubuntu系统搭建Python开发环境

假如你在VirtualBox上安装了Ubuntu Server,用SSH连接的话,查看进程:

$ ps aux | grep sshd

验证SSH是否启动。

没有安装的话,执行下面命令:

$ sudo apt-get install ssh

设置语言

$ sudo locale-gen ja_JP.UTF-8
$ sudo dpkg-reconfigure locales

设置.bashrc中语言设定用的环境变量

export LANGUAGE=ja_JP:ja
export LANG=ja_JP.UTF-8

安装Python

安装apt包

$ sudo apt-get -y update
$ sudo apt-get -y upgrade
$ sudo apt-get -y install build-essential
$ sudo apt-get -y install libsqlite3-dev
$ sudo apt-get -y install libreadline6-dev
$ sudo apt-get -y install libgdbm-dev
$ sudo apt-get -y install zlib1g-dev
$ sudo apt-get -y install libbz2-dev
$ sudo apt-get -y install sqlite3
$ sudo apt-get -y install tk-dev
$ sudo apt-get -y install zip

安装Python相关的包

# 安装python-dev
$ sudo apt-get -y install python-dev

# 安装Distribute
$ sudo chmod -R 0775 /usr/local
$ sudo chgrp -R bpbook /usr/local
$ wget http://python-distribute.org/distribute_setup.py
$ sudo python distribute_setup.py

确认Python版本

$ python -V

安装easy_install和pip

$ wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py
$ sudo python get-pip.py

确认pip的版本

$ pip --version

安装virtualenv

$ pip install virtualenv

确认virtualenv的help

$ virtualenv --help

virtualenv的用法

确认安装的包

$ pip freeze

用virtualenv创建虚拟环境

$ export VIRTUALENV_USE_DISTRIBUTE=true
$ mkdir ~/work
$ cd ~/work
$ virtualenv env

执行虚拟环境

$ source env/bin/activate

确认虚拟环境的版本

(env) $ pip freeze
distribute==0.6.24
wsgiref==0.1.2

退出虚拟环境

(env) $ deactivate

virtualenvwrapper

安装virtualenvwrapper

$ pip install virtualenvwrapper

设置.bashrc

if [ -f /usr/local/bin/virtualenvwrapper.sh ]; then
  export WORKON_HOME=$HOME/.virtualenvs
  source /usr/local/bin/virtualenvwrapper.sh
fi

重新加载.bashrc

$ source ~/.bashrc

确认mkvirtualenv的帮助

$ mkvirtualenv --help

创建虚拟环境

$ mkvirtualenv newenv

确认虚拟环境

$ ls -la $HOME/.virtualenvs

执行虚拟环境

$ workon newenv

虚拟环境一览

$ workon

删除虚拟环境

$ rmvirtualenv newenv

适用指定版本的Python

$ mkvirtualenv --python=/usr/bin/python2.7 newenv2

确认虚拟环境的Python

(newenv2) $ python -v

设置Mercurial

安装Mercurial

$ sudo pip install mercurial

确认Mercurial

$ hg --version

创建库

设置.hgrc的用户名和邮件地址

[ui]
username=bpbook <bpbook@beproud.jp>
editor = vim

[extensions]
graphlog=
color=
pager=

[pager]
pager=LESS='FSRX' less

hg init(初始化库)

$ mkdir ~/hgtest
cd ~/hgtest
hg init

操作文件

hg status(确认状态)

$ touch test.txt
$ hg status

? test.txt

hg add(添加文件)

$ hg add test.txt
$ hg status

A test.txt

hg commit(提交)

$ hg commit 

test commit

...

确认编辑器

$ echo $EDITOR
vim

hg diff(确认区分)

$ hg status
M test.txt

$ hg diff

diff -r 23742103740 test.txt
...

hg revert(取消编辑)

$ hg revert test.txt

hg rollback(取消提交)

$ hg rollback
$ hg status

A test.txt

hg log(确认提交日志)

$ hg log

编辑器

Vim

编辑.vimrc

Python用的设定

syntax on
filetype plugin indent on

准备配置Python的文件

$ mkdir ~/.vim
$ mkdir ~/.vim/ftplugin
$ touch ~/.vim/ftplugin/python.vim

把下面的内容添加到python.vim文件中

setl expandtab
setl tabstop=4
setl shiftwidth=4
setl softtabstop=4

行尾空格删除的设置

autocmd BufWritePre * :%s/\s\+$//ge

一行80文字换行的设置

setlocal textwidth=80

Emacs

创建.emacs文件

$ touch ~/.emacs

缩进空格

(setq indent-tabs-mode nil)
(global-font-lock-mode 1)

Good Wiki

开发Python的Tips

感应模式

$ python
>>> import sys
>>> sys.path
...

安装IPython

$ pip install IPython

安装pep8(编码样式检查)

$ pip install pep8

安装pyflakes(语法检查/模型)

$ pip install pyflakes

pdb(调试器)

添加pdb的代码

def add(x,y)
  return x + y

x = 0
import pdb; pdb.set_trace()
x = add(1, 2)

执行pdb的例子

$ python pdbtest.py
> pdbtest.py(7)<module>()
-> x = add(1, 2)
(Pdb)

The Zen of Python

>>> import this

参考资料

Pythonプロフェッショナルプログラミング第1章

emacs mercurial python ubuntu vim