Qiang Blog

Just another zhangjingqiang's blog.

Architectural quality attributes

  • Modifiability
  • Testability
  • Scalability
  • Performance
  • Availability
  • Security
  • Deployability

Software Architecture with Python

architecture

Hadoop Ecosystem

Storing & Querying

  • Apache Hive
  • Apache HBase

Bulk Transferring & Streaming

  • Apache Sqoop
  • Apache Flume

Serializing

  • Apache Avro
  • Apache Parquet

Messaging & Indexing

  • Apache Kafka
  • Apache Solr
  • Apache Mahout

Practical Hadoop Ecosystem

big-data ecosystem hadoop

How To Deploy Rails Apps Using Unicorn And Nginx on CentOS 7.3

Set iptables

yum install iptables-services
iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
service iptables save

Install git and nginx

yum install -y epel-release
yum install -y git nginx

Edit the default nginx config file: set the user and comment out the default server block

/etc/nginx/nginx.conf

user root;
...

#    server {
#        listen       80 default_server;
#        listen       [::]:80 default_server;
#        server_name  _;
#        root         /usr/share/nginx/html;
#
#        # Load configuration files for the default server block.
#        include /etc/nginx/default.d/*.conf;
#
#        location / {
#        }
#
#        error_page 404 /404.html;
#            location = /40x.html {
#        }
#
#        error_page 500 502 503 504 /50x.html;
#            location = /50x.html {
#        }
#   }

Create rails.conf under /etc/nginx/conf.d/

upstream app {
    # Path to the Unicorn socket file; must match the listen path in config/unicorn.rb
    server unix:/home/deploy/myapp/tmp/sockets/unicorn.sock fail_timeout=0;
}

server {
    listen 80;
    server_name node1;

    # Application root: the public directory of the Rails app
    root /home/deploy/myapp/public;

    try_files $uri/index.html $uri @app;

    location @app {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_redirect off;
        proxy_set_header X-FORWARDED_PROTO https;
        proxy_pass http://app;
    }

    error_page 500 502 503 504 /500.html;
    client_max_body_size 4G;
    keepalive_timeout 10;
}

Check syntax and restart Nginx

nginx -t
systemctl restart nginx

Install rbenv

git clone https://github.com/rbenv/rbenv.git ~/.rbenv
git clone https://github.com/rbenv/ruby-build.git ~/.rbenv/plugins/ruby-build
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(rbenv init -)"' >> ~/.bash_profile
source ~/.bash_profile

Switch ruby

rbenv install 2.3.1
rbenv global 2.3.1
rbenv rehash

Install bundler, rails and unicorn

gem install bundler
gem install rails -v 5.0.0
gem install unicorn

Change to myapp path

cd /home/deploy/myapp

Setup

bundle install
RAILS_ENV=production bundle exec rails assets:precompile
RAILS_ENV=production bundle exec rails db:migrate

Start unicorn

bundle exec unicorn_rails -c config/unicorn.rb -D -E production
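
If Nginx answers with 502 Bad Gateway after this step, a first thing to check is whether Unicorn is actually listening on the socket named in rails.conf. A minimal sketch in Python, assuming the socket path from the upstream block above; the function name is made up for illustration:

```python
import socket

# Path taken from the upstream block in rails.conf; adjust if yours differs.
UNICORN_SOCKET = "/home/deploy/myapp/tmp/sockets/unicorn.sock"


def socket_accepts_connections(path, timeout=1.0):
    """Return True if something is listening on the Unix socket at `path`."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return True
    except OSError:
        # Missing file, stale socket file, or a permission problem.
        return False
    finally:
        sock.close()


if __name__ == "__main__":
    state = "up" if socket_accepts_connections(UNICORN_SOCKET) else "down"
    print("unicorn socket is " + state)
```

Run it as the same user Nginx runs as, since a socket that exists but is not accessible to that user also shows up as "down".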

centos nginx rails unicorn

How to create MongoDB replica set on CentOS 7

Nodes

  • node1
  • node2

/etc/hosts

192.168.0.1              node1
192.168.0.2              node2

Install

On both nodes

vi /etc/yum.repos.d/mongodb-org.repo
[mongodb-org-3.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.2/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-3.2.asc

yum install mongodb-org

Create keyfile

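On node1 (every member must use the same keyfile, so generate it once and copy it to node2)

openssl rand -base64 756 > /root/keyfile
chmod 400 /root/keyfile
scp -p /root/keyfile node2:/root/keyfile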
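
The 756 in the openssl command is not arbitrary: 756 random bytes encode to 1008 base64 characters, just under the 1024-character limit MongoDB places on keyfile contents. The same kind of file could be produced from Python; this is only a sketch, and the function names are assumptions for illustration:

```python
import base64
import os


def make_keyfile_content(num_bytes=756):
    """Return random base64 text, comparable to `openssl rand -base64 756`."""
    return base64.b64encode(os.urandom(num_bytes)).decode("ascii")


def write_keyfile(path, content):
    """Create the file readable by its owner only (the effect of chmod 400)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "w") as handle:
        handle.write(content + "\n")


if __name__ == "__main__":
    key = make_keyfile_content()
    print(len(key))  # 756 bytes become 1008 base64 characters
```

Whichever way the key is generated, the members authenticate to each other with it, so the identical file has to be present on every node.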

Edit config file

On both nodes

service mongod stop
mkdir /mongo-metadata

vi /etc/mongod.conf
storage:
  dbPath: /mongo-metadata
  ...
net:
  port: 27017
security:
  keyFile: /root/keyfile
replication:
  replSetName: rs0

mongod --config /etc/mongod.conf

Start replica set and add node

On node1

mongo
rs.initiate()
use admin
db.createUser(
  {
    user: "admin",
    pwd: "password",
    roles: [ { role: "root", db: "admin" } ]
  }
);
db.auth('admin', 'password')
1
rs.add("node2")
{ "ok" : 1 }
rs.conf()
{
        "_id" : "rs0",
        "version" : 2,
        "protocolVersion" : NumberLong(1),
        "members" : [
                {
                        "_id" : 0,
                        "host" : "node1:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                },
                {
                        "_id" : 1,
                        "host" : "node2:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "getLastErrorModes" : {

                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                },
                "replicaSetId" : ObjectId("596ee39369296db40861b82f")
        }
}
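
With the set running, clients would normally list both members and name the replica set so that the driver can follow a failover. A sketch of building such a connection string with the standard library only; the helper name is hypothetical, and the credentials are the sample ones used above:

```python
from urllib.parse import quote_plus


def replica_set_uri(hosts, replica_set, user, password, auth_db="admin"):
    """Build a mongodb:// URI that lists every member of the replica set."""
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    credentials = "{0}:{1}".format(quote_plus(user), quote_plus(password))
    members = ",".join(hosts)
    return "mongodb://{0}@{1}/?replicaSet={2}&authSource={3}".format(
        credentials, members, replica_set, auth_db)


uri = replica_set_uri(["node1:27017", "node2:27017"], "rs0", "admin", "password")
print(uri)
# mongodb://admin:password@node1:27017,node2:27017/?replicaSet=rs0&authSource=admin
```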

Official

centos mongodb

Amazon Web Services LiveLessons

Notes from the course that are worth keeping at hand.

The Ideal

  • Highly Available
  • Fault Tolerant
  • Secure
  • Durable
  • Self Healing
  • Automated
  • Cost Effective

Best Practices

  • Design for Failure
  • Scale Horizontally
  • Disposable Resources over Fixed Servers
  • Automate, Automate, Automate!
  • Security in Layers
  • Loose Coupling
  • Optimize for Cost

amazon-web-services

How to use serverspec to test servers?

Start a test project

$ serverspec-init

Directory

.
├── README.md
├── Rakefile
├── properties.yml
└── spec
    ├── app
    │   └── ruby_spec.rb
    ├── base
    │   ├── host_spec.rb
    │   └── users_and_groups_spec.rb
    ├── db
    │   └── mysql_spec.rb
    ├── proxy
    │   └── nginx_spec.rb
    ├── spec_helper.rb
    └── worker
        └── redis_spec.rb

Source

Rakefile

require 'rake'
require 'rspec/core/rake_task'
require 'yaml'

properties = YAML.load_file('properties.yml')

task :spec    => 'serverspec:all'
task :default => :spec

namespace :serverspec do
  task :all => properties.keys.map {|key| 'serverspec:' + key.split('.')[0] }
  properties.keys.each do |key|
    desc "Run serverspec to #{key}"
    RSpec::Core::RakeTask.new(key.split('.')[0].to_sym) do |t|
      ENV['TARGET_HOST'] = key
      t.pattern = 'spec/{' + properties[key][:roles].join(',') + '}/*_spec.rb'
    end
  end
end

properties.yml

# server1

app.server1:
  :roles:
    - base
    - proxy
db.server1:
  :roles:
    - db
worker.server1:
  :roles:
    - app
    - worker

# server2

app.server2:
  :roles:
    - base
    - proxy
db.server2:
  :roles:
    - db
worker.server2:
  :roles:
    - app
    - worker
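
To see what the Rakefile derives from properties.yml, the same mapping can be traced outside Rake. A sketch in Python with part of the properties inlined as a dict (an assumption made to avoid needing a YAML parser); the two helpers mirror key.split('.')[0] and the t.pattern expression from the Rakefile:

```python
# Inlined copy of part of properties.yml.
PROPERTIES = {
    "app.server1": ["base", "proxy"],
    "db.server1": ["db"],
    "worker.server1": ["app", "worker"],
    "app.server2": ["base", "proxy"],
}


def task_name(host_key):
    """Mirror 'serverspec:' + key.split('.')[0] from the Rakefile."""
    return "serverspec:" + host_key.split(".")[0]


def spec_pattern(roles):
    """Mirror the t.pattern string built from the host's roles."""
    return "spec/{" + ",".join(roles) + "}/*_spec.rb"


for key, roles in PROPERTIES.items():
    print(key, "->", task_name(key), spec_pattern(roles))
```

Traced this way, app.server1 and app.server2 both map to the task name serverspec:app, so hosts that share a prefix also share a Rake task.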

spec/spec_helper.rb

require 'serverspec'
require 'net/ssh'
require 'yaml'

properties = YAML.load_file('properties.yml')

set :backend, :ssh
set :request_pty, true

if ENV['ASK_SUDO_PASSWORD']
  begin
    require 'highline/import'
  rescue LoadError
    fail "highline is not available. Try installing it."
  end
  set :sudo_password, ask("Enter sudo password: ") { |q| q.echo = false }
else
  set :sudo_password, ENV['SUDO_PASSWORD']
end

host = ENV['TARGET_HOST']
set_property properties[host]

options = Net::SSH::Config.for(host)

options[:user] ||= Etc.getlogin

set :host,        options[:host_name] || host
set :ssh_options, options

# Disable sudo
# set :disable_sudo, true


# Set environment variables
# set :env, :LANG => 'C', :LC_MESSAGES => 'C'

# Set PATH
# set :path, '/sbin:/usr/local/sbin:$PATH'

spec/app/ruby_spec.rb

require_relative '../spec_helper'

describe process("ruby") do
  it { should be_running }
end

spec/base/host_spec.rb

require_relative '../spec_helper'

shared_examples "cpu and memory should be ok" do
  it { should be_resolvable }
  it { should be_resolvable.by('dns') }

  it 'CPU should eq 2' do
    expect(host_inventory['cpu']['total']).to eq('2')
  end

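  it 'Memory should > 7000000kB' do
    # The inventory value is a string ending in "kB"; compare it as a number,
    # because string comparison is character by character.
    expect(host_inventory['memory']['total'].to_i).to be > 7_000_000
  end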
end

# server1

describe host('www.server1.com') do
  include_examples "cpu and memory should be ok"
end

# server2

describe host('www.server2.com') do
  include_examples "cpu and memory should be ok"
end

spec/db/mysql_spec.rb

require_relative '../spec_helper'

describe 'MySQL config parameters' do
  context mysql_config('innodb-buffer-pool-size') do
    its(:value) { should > 100000000 }
  end

  context mysql_config('socket') do
    its(:value) { should eq '/var/lib/mysql/mysql.sock' }
  end
end

spec/proxy/nginx_spec.rb

require_relative '../spec_helper'

describe port(80) do
  it { should be_listening }
end

describe port(80) do
  it { should be_listening.with('tcp') }
end

describe process("nginx") do
  it { should be_running }
end

spec/worker/redis_spec.rb

require_relative '../spec_helper'

describe process("redis") do
  it { should be_running }
end

Run test cases

$ rake spec

More resource types

http://serverspec.org/resource_types.html

serverspec

How to give multiple servers different colors in bulk with Ansible?

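The accidental deletion of GitLab's database in January 2017 made people everywhere take server safety more seriously. Giving different servers different background colors is a fairly effective precaution. Below, Ansible is used to set up the Tmux powerline theme so that servers in different environments can be told apart.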

Directory

.
├── README.md
└── provisioning
    ├── files
    │   └── .zshrc
    ├── inventory
    ├── playbook.yml
    ├── tasks
    │   ├── tmux.yml
    │   └── zsh.yml
    └── templates
        └── .tmux.conf.j2

How to use?

$ cd provisioning
$ ansible-playbook playbook.yml -i inventory

Source

provisioning/files/.zshrc

if [ -z "$TMUX" ]; then tmux; fi

provisioning/templates/.tmux.conf.j2

source-file "/home/{{username}}/.tmux-themepack/powerline/block/{{color}}.tmuxtheme"

provisioning/tasks/tmux.yml

---
- name: Install the latest version of Tmux
  yum: name=tmux state=latest

- name: Install tmux-themepack
  git: repo=https://github.com/jimeh/tmux-themepack
       dest=/home/{{username}}/.tmux-themepack

- name: Copy .tmux.conf file to servers
  template:
    src: templates/.tmux.conf.j2
    dest: /home/{{username}}/.tmux.conf

provisioning/tasks/zsh.yml

---
- name: Install the latest version of Zsh
  yum: name=zsh state=latest

- name: Copy .zshrc file to servers
  copy:
    src: files/.zshrc
    dest: /home/{{username}}/.zshrc

- name: Start zsh shell
  user: name={{username}} shell=/bin/zsh

provisioning/playbook.yml

---
- hosts: all
  become: yes
  vars_prompt:
    - name: "username"
      prompt: "Enter username"
      private: no

  tasks:
    # import_tasks needs Ansible 2.4 or later; older releases used "include"
    - import_tasks: tasks/tmux.yml
    - import_tasks: tasks/zsh.yml

provisioning/inventory

[server1]
app1.server1
app2.server1

[server1:vars]
color=blue

[server2]
app1.server2
app2.server2

[server2:vars]
color=orange

[server3]
app1.server3
app2.server3

[server3:vars]
color=red

[servers:children]
server1
server2
server3
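
Each host ends up with a one-line .tmux.conf whose theme depends on its inventory group. The substitution Ansible performs can be imitated as follows; this sketch uses plain string replacement instead of Jinja2, and the username deploy is an assumed example value:

```python
# Template line from provisioning/templates/.tmux.conf.j2
TEMPLATE = ('source-file "/home/{{username}}/.tmux-themepack/'
            'powerline/block/{{color}}.tmuxtheme"')

# Group colors from provisioning/inventory
GROUP_COLORS = {"server1": "blue", "server2": "orange", "server3": "red"}


def render(template, variables):
    """Replace each {{name}} placeholder; a stand-in for Jinja2 rendering."""
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", value)
    return template


for group, color in GROUP_COLORS.items():
    # "deploy" stands in for the value typed at the vars_prompt.
    print(group, render(TEMPLATE, {"username": "deploy", "color": color}))
```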

ansible tmux zsh

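Automatically writing local file contents to a Google Spreadsheet with the Google API

Google APIs required

  • Google Drive API
  • Google Spreadsheet API

Features

  • Obtain Google authorization
  • Create a Google Spreadsheet in Google Drive
  • Share it with users and domains
  • Delete the default Sheet1 and create a new fixed sheet
  • Loop over every file under a given local path, creating a sheet for each and writing its contents
  • Style the sheet header (titles, color, frozen row, bold, etc.)

File type used in this example

  • tsv

Usage

Step 1

Download client_secret.json from the Google developer page.
Reference:

https://developers.google.com/sheets/api/quickstart/python

Step 2

Run in a terminal:

$ python write_to_google_sheets.py <spreadsheet_name> <folder_path>

Code highlights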

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function
import httplib2
import os
import sys
import json
import re
from termcolor import colored

from apiclient import discovery
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage
import requests

reload(sys)
sys.setdefaultencoding('utf8')

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/sheets.googleapis.com-python-quickstart.json
SCOPES = "https://www.googleapis.com/auth/drive https://www.googleapis.com/auth/spreadsheets"
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google API Drive + Spreadsheet'

def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'sheets.googleapis.com-python-quickstart.json')

    store = Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        # Parse an empty argument list so oauth2client's own flags do not
        # clash with this script's positional arguments.
        flags = tools.argparser.parse_args([])
        credentials = tools.run_flow(flow, store, flags)
        print('Storing credentials to ' + credential_path)
    return credentials

def main():
    """Create spreadsheet and write data into it
    Use:
    - Google Drive API
    - Google Spreadsheet API
    """
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())

    # Create new spreadsheet
    spreadsheet_id = create_spreadsheet(http)

    # Create new sheet
    write_data_to_sheets(http, spreadsheet_id)

    print(colored('Finish!', 'green'))

def create_spreadsheet(http):
    drive_service = discovery.build('drive', 'v3', http=http)
    file_metadata = {
      'name' : sys.argv[1],
      'mimeType' : 'application/vnd.google-apps.spreadsheet'
    }
    file = drive_service.files().create(body=file_metadata, fields='id').execute()
    # share spreadsheet
    share(drive_service, file.get('id'))
    return file.get('id')

def write_data_to_sheets(http, spreadsheet_id):
    discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?'
                    'version=v4')
    spreadsheet_service = discovery.build('sheets', 'v4', http=http,
                              discoveryServiceUrl=discoveryUrl)

    files = os.listdir(sys.argv[2])
    print(colored(files, 'red'))

    # create new sheets and delete default sheet1
    spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=base_sheet()).execute()
    # write log files to sheets
    for file in files:
        write_log_to_sheet(spreadsheet_service, spreadsheet_id, file)

def base_sheet():
    data = {
      'requests': [
        {
          'addSheet':{
            'properties': {'title': u'New Sheet Name'}
          }
        },
        {
          'deleteSheet':{
            'sheetId': 0
          }
        }
      ]
    }
    return data

def write_log_to_sheet(spreadsheet_service, spreadsheet_id, file):
    # make sheet
    body = {
      'requests': [
        {
          'addSheet':{
            'properties': {'title': u'{0}'.format(file)}
          }
        }
      ]
    }
    result = spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()
    sheetId = result['replies'][0]['addSheet']['properties']['sheetId']

    # write to sheet
    values = []
    values.append(titles(file))
    # os.listdir() returns bare names, so join them with the folder path
    lines = [line.rstrip() for line in open(os.path.join(sys.argv[2], file))]
    for i in range(len(lines)):
        values.append(re.split(r'\t+', lines[i]))
    data = [
            {
                'range': '{0}!A1'.format(file),
                'values': values
            }
    ]
    body = {
            'valueInputOption': 'USER_ENTERED',
            'data': data
    }
    spreadsheet_service.spreadsheets().values().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()

    # format header row
    body = {
    "requests": [
    {
      "repeatCell": {
        "range": {
          "sheetId": sheetId,
          "startRowIndex": 0,
          "endRowIndex": 1
        },
        "cell": {
          "userEnteredFormat": {
            "backgroundColor": {
              "red": 0.0,
              "green": 0.0,
              "blue": 1.0
            },
            "horizontalAlignment" : "LEFT",
            "textFormat": {
              "foregroundColor": {
                "red": 1.0,
                "green": 1.0,
                "blue": 1.0
              },
              "fontSize": 12,
              "bold": True
            }
          }
        },
        "fields": "userEnteredFormat(backgroundColor,textFormat,horizontalAlignment)"
      }
    },
    {
      "updateSheetProperties": {
        "properties": {
          "sheetId": sheetId,
          "gridProperties": {
            "frozenRowCount": 1
          }
        },
        "fields": "gridProperties.frozenRowCount"
      }
    }
    ]
    }
    spreadsheet_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body=body).execute()

def titles(file):
    values_title = [
        'ID', 'Name', 'Description'
    ]
    return values_title

def share(drive_service, spreadsheet_id):
    batch = drive_service.new_batch_http_request(callback=callback)
    # share to users
    user_permission = [
        'user1@example.com',
        'user2@example.com'
    ]
    for user in user_permission:
        share_user(drive_service, spreadsheet_id, batch, user)
    # share to domains
    domain_permission = [
        'google.com',
        'facebook.com'
    ]
    for domain in domain_permission:
        share_domain(drive_service, spreadsheet_id, batch, domain)
    # batch execute
    batch.execute()

def share_user(drive_service, spreadsheet_id, batch, user):
    user_permission = {
            'type': 'user',
            'role': 'writer',
            'emailAddress': user
            }
    batch.add(drive_service.permissions().create(
        fileId=spreadsheet_id,
        body=user_permission,
        fields='id',
        ))
    return batch

def share_domain(drive_service, spreadsheet_id, batch, domain):
    domain_permission = {
            'type': 'domain',
            'role': 'reader',
            'domain': domain
            }
    batch.add(drive_service.permissions().create(
        fileId=spreadsheet_id,
        body=domain_permission,
        fields='id',
        ))
    return batch

def callback(request_id, response, exception):
    if exception:
        print(exception)
    else:
        print("Permission Id: {0}".format(response.get('id')))

if __name__ == '__main__':
    if len(sys.argv) == 3 and os.path.isdir(sys.argv[2]):
        main()
    else:
        print('Please input a file name and log path.')
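
The row-building part of write_log_to_sheet can be exercised without any Google credentials. The sketch below isolates it; the function names are made up for illustration, and it splits on single tabs so that empty cells keep their column, whereas the script's re.split(r'\t+', ...) collapses consecutive tabs:

```python
HEADER = ["ID", "Name", "Description"]  # same titles as titles() in the script


def tsv_to_values(lines, header=HEADER):
    """Turn TSV lines into the list-of-rows shape used in the request body."""
    rows = [list(header)]
    for line in lines:
        line = line.rstrip("\r\n")
        if line:
            rows.append(line.split("\t"))
    return rows


def values_update_body(sheet_title, rows):
    """Build the body passed to spreadsheets().values().batchUpdate()."""
    return {
        "valueInputOption": "USER_ENTERED",
        "data": [{"range": "{0}!A1".format(sheet_title), "values": rows}],
    }


body = values_update_body(
    "sample.tsv", tsv_to_values(["1\tfoo\tfirst\n", "2\t\tno name\n"]))
print(body["data"][0]["values"])
```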

google-drive-api google-spreadsheet-api python

How to bulk-generate a JSON list with Ruby?

File example:

# text_list
text1
text2
text3

Ruby batch:

#!/usr/bin/ruby

require 'json'

class Maker
  def initialize(counter=0)
    @counter = counter
    case counter
    when 0
      @position = 'bottom-center'
      @type = 'horizontal'
    when 1
      @position = 'vertical-right'
      @type = 'vertical'
    end
  end

  def read_write
    File.open('text_list', 'r') do |fr|
      while(line = fr.gets)
        json = json_format(line.strip, @counter)
        @counter = @counter + 1
        p JSON.generate(json)
        File.open('new_json', 'a') { |fw| fw.write(JSON.generate(json).to_s + ',') }
      end
    end
  end

  def json_format(line, counter)
    {
      "id":"#{counter}",
      "position": "#{@position}",
      "type": "#{@type}",
      "text":"#{line}"
    }
  end
end

if ARGV.empty?
  p 'Input a value, please.'
  p 'For example:'
  p 'Horizontal -- ruby mjl.rb 0'
  p 'Vertical -- ruby mjl.rb 1'
  exit
end

if ARGV.length > 1
  p 'Please input one value only.'
  exit
end

if not [0, 1].include?(ARGV[0].to_i)
  p 'Please input 0 or 1'
  exit
end

m = Maker.new ARGV[0].to_i
m.read_write

Then use https://jsonformatter.curiousconcept.com to format the JSON list.
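
The formatter step is needed because the script appends one object plus a comma per line, which is not a complete JSON document on its own. An alternative is to collect the entries and serialize them as a single array. A sketch in Python, assuming the same text_list input and the same fields; the function name is hypothetical:

```python
import json

LAYOUTS = {
    0: {"position": "bottom-center", "type": "horizontal"},
    1: {"position": "vertical-right", "type": "vertical"},
}


def make_json_list(lines, mode):
    """Build one entry per line; ids start at `mode`, as in the Ruby script."""
    layout = LAYOUTS[mode]
    entries = []
    for offset, line in enumerate(lines):
        entries.append({
            "id": str(mode + offset),
            "position": layout["position"],
            "type": layout["type"],
            "text": line.strip(),
        })
    # Serializing the whole list yields valid JSON with no trailing comma.
    return json.dumps(entries, indent=2)


print(make_json_list(["text1\n", "text2\n", "text3\n"], 0))
```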

json ruby

How to bulk compare images with ImageMagick using Python and Ruby?

To compare all the images under two different paths, list the image paths in two files and compare them in bulk with a batch script.

For example:

# old
1.png
2.png
# new
1.png
2.png( - Different image)

Then run the python or ruby script:

#!/usr/bin/python
# coding: UTF-8

import os
import sys
import subprocess
from termcolor import colored

reload(sys)
sys.setdefaultencoding('utf8')

def main():
    if len(sys.argv) == 3 and os.path.isfile(sys.argv[1]) and os.path.isfile(sys.argv[2]):
        compare()
    else:
        print 'Please input regular files'

def compare():
    list1 = [line.rstrip() for line in open(sys.argv[1])]
    list2 = [line.rstrip() for line in open(sys.argv[2])]
    for i in range(len(list1)):
        os.system("composite -compose difference {0} {1} {2}".format(list1[i], list2[i], '/tmp/diff.png'))
        pipe = subprocess.Popen("identify -format %[mean] {0}".format('/tmp/diff.png'), shell=True, stdout=subprocess.PIPE).stdout
        value = pipe.read()
        if float(value) > 0:
            os.system("cp /tmp/diff.png {0}".format(str(i) + '.png'))
            print colored("{0}{1} Diff - {2}".format('[' + str(i) + ']', float(value), list2[i]), 'red')
        else:
            print colored("{0}{1} Same - {2}".format('[' + str(i) + ']', value, list2[i]), 'green')

if __name__ == "__main__":
    print 'Compare by python:'
    main()

#!/usr/bin/ruby

require 'colorize'

class CompareImages
  def initialize()
    puts "Compare by ruby:"
  end

  def read_put
    list1 = File.readlines(ARGV[0]).map{|x| x.strip}
    list2 = File.readlines(ARGV[1]).map{|x| x.strip}
    (0..list1.length - 1).each do |i|
      `composite -compose difference #{list1[i]} #{list2[i]} /tmp/diff.png`
      value = `identify -format %[mean] /tmp/diff.png`
      if value.to_f > 0
        `cp /tmp/diff.png #{i}.png`
        puts "[#{i}]#{value} Diff - #{list2[i]}".red
      else
        puts "[#{i}]#{value} Same - #{list2[i]}".green
      end
    end
  end
end

ci = CompareImages.new
ci.read_put

$ python ci.py old new
$ ruby ci.rb old new

Both scripts also print their results in color in the terminal!
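
A Python 3 variant of the same loop might look like the sketch below. It keeps the two ImageMagick commands from the scripts above but passes them as argument lists, so file names with spaces are not split by a shell, and it pairs the lists with zip, so lists of unequal length do not raise an IndexError. The function names are made up for illustration:

```python
import subprocess

DIFF_PATH = "/tmp/diff.png"


def read_list(path):
    """Return the non-empty lines of a list file."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def is_different(mean_output):
    """identify prints the mean pixel value of the diff image; 0 means identical."""
    return float(mean_output) > 0


def mean_difference(old_image, new_image, diff_path=DIFF_PATH):
    """Run composite, then identify, and return identify's output as text."""
    subprocess.run(["composite", "-compose", "difference",
                    old_image, new_image, diff_path], check=True)
    result = subprocess.run(["identify", "-format", "%[mean]", diff_path],
                            check=True, stdout=subprocess.PIPE)
    return result.stdout.decode().strip()


def compare(old_list, new_list):
    """Print one Diff/Same line per pair of images."""
    for index, (old_image, new_image) in enumerate(zip(old_list, new_list)):
        mean = mean_difference(old_image, new_image)
        label = "Diff" if is_different(mean) else "Same"
        print("[{0}]{1} {2} - {3}".format(index, mean, label, new_image))
```

It would be called as compare(read_list("old"), read_list("new")), with ImageMagick installed as for the scripts above.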

imagemagick python ruby