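Python Syntax Overview and Practical Cheat Sheet
Note: because of the blog's length limits and broken formatting, the original Markdown document is stored in the attachment.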
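This article summarizes Modern Python Development: Syntax Basics and Engineering Practice; for more Python material, see the Python Learning and Practice Resource Index. It draws on Python Crash Course - Cheat Sheets, pysheeet, and similar sources. It covers only the syntax and key points the author uses regularly in day-to-day work and considers important. Readers who want to go further with Python, or who are interested in machine learning and data mining, can refer to The Programmer's Practical Handbook of Data Science and Machine Learning.
Basic Syntax
Python is a high-level, dynamically typed, multi-paradigm programming language. A Python source file often begins by declaring how the script is invoked and which encoding the file uses: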
# How the script is invoked (shebang line)
#!/usr/bin/env python
# Declare UTF-8 encoding
# -*- coding: utf-8 -*-
# Declare another encoding
# -*- coding: <encoding-name> -*-
# Vim also recognizes the following form
# vim:fileencoding=<encoding-name>
Life is short, use Python. Its many powerful pieces of syntactic sugar often make Python code read almost like pseudocode. For example, a simple quicksort in Python is noticeably shorter than a Java version:
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]  # integer division keeps the index an int
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 8, 10, 1, 2, 1]))
# Prints "[1, 1, 2, 3, 6, 8, 10]"
Console Interaction
The __name__ variable tells you whether a script is being run directly with the python command or imported from elsewhere. Google's open-source fire library is also a handy way to quickly wrap a class as a command-line tool:
import fire

class Calculator(object):
    """A simple calculator class."""

    def double(self, number):
        return 2 * number

if __name__ == '__main__':
    fire.Fire(Calculator)

# python calculator.py double 10  # 20
# python calculator.py double --number=15  # 30
In Python 2, print is a statement, whereas in Python 3 it is a function. To use print as a function in Python 2, import it from __future__:
from __future__ import print_function
pprint can be used to pretty-print console output:
import pprint

stuff = ['spam', 'eggs', 'lumberjack', 'knights', 'ni']
pprint.pprint(stuff)

# Custom parameters
pp = pprint.PrettyPrinter(depth=6)
tup = ('spam', ('eggs', ('lumberjack', ('knights', ('ni', ('dead', ('parrot', ('fresh fruit',))))))))
pp.pprint(tup)
Modules
A module in Python is simply a Python source file that can export classes, functions, and global variables. When we import from a module, the module name generally serves as the namespace. A package is a directory of modules, and an __init__.py file usually marks the directory as a package:
# Directory layout
someDir/
    main.py
    siblingModule.py

# siblingModule.py
def siblingModuleFun():
    print('Hello from siblingModuleFun')

def siblingModuleFunTwo():
    print('Hello from siblingModuleFunTwo')

# main.py
import siblingModule
import siblingModule as sibMod

sibMod.siblingModuleFun()

from siblingModule import siblingModuleFun
siblingModuleFun()

try:
    # Import 'someModuleA' that is only available in Windows
    import someModuleA
except ImportError:
    try:
        # Import 'someModuleB' that is only available in Linux
        import someModuleB
    except ImportError:
        pass  # neither module is available
A package can provide a single entry point for all the files in a directory:
someDir/
    main.py
    subDir/
        __init__.py
        subA.py
        subSubDir/
            __init__.py
            subSubA.py

# subA.py
def subAFun():
    print('Hello from subAFun')

def subAFunTwo():
    print('Hello from subAFunTwo')

# subSubA.py
def subSubAFun():
    print('Hello from subSubAFun')

def subSubAFunTwo():
    print('Hello from subSubAFunTwo')

# __init__.py from subDir
# Adds 'subAFun()' and 'subAFunTwo()' to the 'subDir' namespace
from .subA import *

# The following two import statements do the same thing: they add
# 'subSubAFun()' and 'subSubAFunTwo()' to the 'subDir' namespace.
# Option 1 assumes '__init__.py' is empty in 'subSubDir'
from .subSubDir.subSubA import *
# Option 2 assumes '__init__.py' in 'subSubDir' has 'from .subSubA import *'
from .subSubDir import *

# __init__.py from subSubDir
# Adds 'subSubAFun()' and 'subSubAFunTwo()' to the 'subSubDir' namespace
from .subSubA import *

# main.py
import subDir

subDir.subAFun()        # Hello from subAFun
subDir.subAFunTwo()     # Hello from subAFunTwo
subDir.subSubAFun()     # Hello from subSubAFun
subDir.subSubAFunTwo()  # Hello from subSubAFunTwo
Expressions and Control Flow
Conditionals
Python uses if, elif, and else for basic conditional selection:
if x < 0:
    x = 0
    print('Negative changed to zero')
elif x == 0:
    print('Zero')
else:
    print('More')
Python also supports a ternary conditional operator:
a if condition else b
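A tuple indexed by a boolean can achieve a similar effect:
# test must evaluate to True or False
(falseValue, trueValue)[test]
# A safer approach is to force a boolean comparison
(falseValue, trueValue)[test == True]
# Or use the bool conversion function
(falseValue, trueValue)[bool(<expression>)]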
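One difference between the two forms is worth seeing in action: the conditional expression evaluates only the branch it selects, while the tuple form builds both values before indexing. The sketch below uses a hypothetical helper, `expensive`, that records each call so the difference is visible; the names are illustrative and not from the original text.

```python
calls = []

def expensive(label):
    """Hypothetical helper: record that it ran, then return its label."""
    calls.append(label)
    return label

test = True

# Conditional expression: only the selected branch is evaluated.
ternary_result = expensive("yes") if test else expensive("no")
calls_after_ternary = list(calls)

# Tuple indexing: both elements are evaluated before one is picked.
calls.clear()
tuple_result = (expensive("no"), expensive("yes"))[test]
calls_after_tuple = list(calls)

print(calls_after_ternary)
print(calls_after_tuple)
```

When either branch has side effects or is costly, the conditional expression is usually the safer choice.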
Loops
for-in can iterate over lists and dictionaries:
words = ['cat', 'window', 'defenestrate']
for w in words:
    print(w, len(w))

# Slicing with [:] quickly copies the list, so words can be modified safely inside the loop
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)
# words -> ['defenestrate', 'cat', 'window', 'defenestrate']
To iterate over a sequence of numbers, use the built-in range function:
a = ['Mary', 'had', 'a', 'little', 'lamb']
for i in range(len(a)):
    print(i, a[i])
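Basic Data Types
Built-in functions perform explicit type conversion (casting); the arguments below are placeholders for values of the named type:
int(str)
float(str)
str(int)
str(float)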
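The isinstance function checks whether an object is an instance of a given class:
ex = 10
# Check whether ex is an int
isinstance(ex, int)

# isinstance also accepts a tuple of types to check several at once
# The function below checks whether a value is a list or a tuple
def is_array(var):
    return isinstance(var, (list, tuple))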
Number: Numeric Types
x = 3
print(type(x))  # Prints "<class 'int'>" (Python 2 prints "<type 'int'>")
print(x)        # Prints "3"
print(x + 1)    # Addition; prints "4"
print(x - 1)    # Subtraction; prints "2"
print(x * 2)    # Multiplication; prints "6"
print(x ** 2)   # Exponentiation; prints "9"
x += 1
print(x)        # Prints "4"
x *= 2
print(x)        # Prints "8"
y = 2.5
print(type(y))  # Prints "<class 'float'>" (Python 2 prints "<type 'float'>")
print(y, y + 1, y * 2, y ** 2)  # Prints "2.5 3.5 5.0 6.25"
Booleans
Python provides the usual logical operators, but note that it does not use && or ||; it uses English words instead.
t = True
f = False
print(type(t))  # Prints "<class 'bool'>" (Python 2 prints "<type 'bool'>")
print(t and f)  # Logical AND; prints "False"
print(t or f)   # Logical OR; prints "True"
print(not t)    # Logical NOT; prints "False"
print(t != f)   # Logical XOR; prints "True"
String
Python 2 has an ASCII-based str type, a separate unicode type, and no dedicated byte type. In Python 3, str holds Unicode text by default, and there are two byte types, bytes and bytearray:
# Python 2
type("Guido")  # string type is str in Python 2
# <type 'str'>

# Import unicode_literals from __future__ to make string literals Unicode
from __future__ import unicode_literals
type("Guido")  # string type becomes unicode
# <type 'unicode'>
Python strings support slicing, format strings, and other common operations:
var1 = 'Hello World!'
var2 = "Python Programming"
print("var1[0]: ", var1[0])
print("var2[1:5]: ", var2[1:5])
# var1[0]:  H
# var2[1:5]:  ytho
print("My name is %s and weight is %d kg!" % ('Zara', 21))
# My name is Zara and weight is 21 kg!
str[0:4]
len(str)
string.replace("-", " ")
",".join(list)
"hi {0}".format('j')
str.find(",")
str.index(",")  # same as find, but raises ValueError when nothing is found
str.count(",")
str.split(",")
str.lower()
str.upper()
str.title()
str.lstrip()
str.rstrip()
str.strip()
str.islower()

# Remove all special characters
import re
re.sub('[^A-Za-z0-9]+', '', mystring)
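To check whether a string contains a substring, or to find the index of a substring:
# The in operator tests for a substring
if "blah" not in somestring:
    continue  # valid only inside a loop

# find returns the index, or -1 when there is no match
s = "This be a string"
if s.find("is") == -1:
    print("No 'is' here!")
else:
    print("Found 'is' in the string.")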
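The difference between `find` and `index` is easy to get wrong, so here is a minimal sketch using the same sample string: `find` signals a miss with -1, while `index` raises `ValueError`. The variable names are illustrative only.

```python
s = "This be a string"

found_at = s.find("is")     # index of the first match ("This" contains "is")
missing_at = s.find("xyz")  # -1 when the substring is absent

try:
    s.index("xyz")          # same search, but a miss raises ValueError
    index_raised = False
except ValueError:
    index_raised = True

contains = "be" in s        # membership test when the position is not needed

print(found_at, missing_at, index_raised, contains)
```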
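Regex: Regular Expressions
import re

# Check whether the string matches
re.match(r'^[aeiou]', str)
# Replace matches with the string given as the second argument
re.sub(r'^[aeiou]', '?', str)
re.sub(r'(xyz)', r'\1', str)
# Compile into a standalone regular expression object
expr = re.compile(r'^...$')
expr.match(...)
expr.sub(...)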
Some common use cases:
# Detect an HTML tag
re.search(r'<[^/>][^>]*>', '<a href="#label">')

# Common username/password pattern
re.match(r'^[a-zA-Z0-9_-]{3,16}$', 'Foo') is not None
re.match(r'^[\w-]{3,16}$', 'Foo') is not None

# Email
re.match(r'^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$', 'hello.world@example.com')

# URL
exp = re.compile(r'''^(https?:\/\/)?  # match http or https
                ([\da-z\.-]+)         # match domain
                \.([a-z\.]{2,6})      # match top-level domain
                ([\/\w \.-]*)\/?$     # match api or file
                ''', re.X)
exp.match('www.google.com')

# IP address
exp = re.compile(r'''^(?:(?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)\.){3}
                     (?:25[0-5]
                     |2[0-4][0-9]
                     |[1]?[0-9][0-9]?)$''', re.X)
exp.match('192.168.1.1')
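Collection Types
List
Operation: Creation, Insertion, and Deletion
list is the basic sequence type:
l = []
l = list()

# The split method turns a string into a list
str.split(".")

# To join a list into a string, use join
list1 = ['1', '2', '3']
str1 = ''.join(list1)

# A list of numbers must be converted to strings first
list1 = [1, 2, 3]
str1 = ''.join(str(e) for e in list1)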
Use append to add a single element and extend to concatenate another list:
x = [1, 2, 3]
x.append([4, 5])  # x is now [1, 2, 3, [4, 5]]

x = [1, 2, 3]
x.extend([4, 5])  # x is now [1, 2, 3, 4, 5]; note that extend returns None
Use pop, slices, del, or remove to delete elements from a list:
myList = [10, 20, 30, 40, 50]
# Pop the second element
myList.pop(1)  # 20
# myList: [10, 30, 40, 50]
# With no argument, pop removes and returns the last element
myList.pop()  # 50

# Use slices to delete an element
a = [1, 2, 3, 4, 5, 6]
index = 3  # only positive indexes
a = a[:index] + a[index + 1:]  # a: [1, 2, 3, 5, 6]

# Delete by index
myList = [10, 20, 30, 40, 50]
rmovIndxNo = 3
del myList[rmovIndxNo]  # myList: [10, 20, 30, 50]

# Use remove to delete by value
letters = ["a", "b", "c", "d", "e"]
letters.remove(letters[1])
print(*letters)  # prints "a c d e"; the * unpacks the list and is optional
Iteration: Indexing and Iteration
A basic for loop iterates over the elements of a list, as shown below:
animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print(animal)
# Prints "cat", "dog", "monkey", each on its own line.
To get the index of the current element while looping, use the enumerate function:
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line
Python also supports slices:
nums = list(range(5))  # range yields integers; list() turns them into a list
print(nums)            # Prints "[0, 1, 2, 3, 4]"
print(nums[2:4])       # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:])        # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2])        # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:])         # Get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1])       # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9]     # Assign a new sublist to a slice
print(nums)            # Prints "[0, 1, 8, 9, 4]"
Comprehensions: Transformations
Python also offers map, reduce, and filter. map transforms every element of a list:
# Use map to square each element
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))

# map can also apply a list of functions to the same value
def multiply(x):
    return (x * x)

def add(x):
    return (x + x)

funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)
reduce folds a list into a single value:
# reduce combines the values of the list pairwise
from functools import reduce
product = reduce((lambda x, y: x * y), [1, 2, 3, 4])
# Output: 24
filter selects the elements of a list that satisfy a predicate:
number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)
# Output: [-5, -4, -3, -2, -1]
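The same transformations are often written as comprehensions, which is what this section's title refers to. A minimal sketch, reusing the `items` and `number_list` values from the examples above (the other names are illustrative):

```python
items = [1, 2, 3, 4, 5]
number_list = range(-5, 5)

# Equivalent to list(map(lambda x: x**2, items))
squared = [x ** 2 for x in items]

# Equivalent to list(filter(lambda x: x < 0, number_list))
less_than_zero = [x for x in number_list if x < 0]

# Filtering and mapping combined in one expression
even_squares = [x ** 2 for x in items if x % 2 == 0]

# Dictionary comprehension: map each item to its square
square_by_item = {x: x ** 2 for x in items}

print(squared, less_than_zero, even_squares, square_by_item)
```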
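Dictionaries
Creation, Insertion, and Deletion
d = {'cat': 'cute', 'dog': 'furry'}  # Create a new dictionary
print(d['cat'])  # Dictionaries do not support the dot operator for access; prints "cute"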
To merge two or more dictionaries:
# Python 3.5+
z = {**x, **y}

# Python 2.7
def merge_dicts(*dict_args):
    """
    Given any number of dicts, shallow copy and merge into a new dict,
    precedence goes to key value pairs in latter dicts.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result
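To make the precedence rule concrete, here is a small sketch with two sample dictionaries that share a key; the values in `x` and `y` are made up for illustration. In both forms the later dictionary wins and the inputs are left unchanged.

```python
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

# Unpacking form (Python 3.5+): later dictionaries override earlier ones
z = {**x, **y}

# Copy-then-update form, which works on older versions too
merged = dict(x)
merged.update(y)

print(z)
print(merged)
```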
Indexing and Iteration
Elements can be accessed directly by key:
d['fish'] = 'wet'  # Set an entry in the dictionary
# Accessing a missing key raises KeyError, so check first or use get
print('cat' in d)  # Check if a dictionary has a given key; prints "True"
# Indexing with [] requires the key to exist; otherwise an exception is raised
print(d['monkey'])  # KeyError: 'monkey' not a key of d
# get accepts a default value
print(d.get('monkey', 'N/A'))  # Get an element with a default; prints "N/A"
print(d.get('fish', 'N/A'))    # Get an element with a default; prints "wet"
d.keys()  # keys returns all the keys
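for-in can iterate over a dictionary:
# Iterate over keys
for key in d:
    ...
# Slower than the previous form in Python 2, where keys() builds a list
for k in d.keys():
    ...
# Iterate over values directly (Python 2.x; use d.values() in Python 3.x)
for value in d.itervalues():
    ...
# Iterate over key-value pairs in Python 2.x
for key, value in d.iteritems():
    ...
# Iterate over key-value pairs in Python 3.x
for key, value in d.items():
    ...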
Other Sequence Types
Sets
# Same as {"a", "b", "c"}
normal_set = set(["a", "b", "c"])

# Adding an element to a normal set is fine
normal_set.add("d")
print("Normal Set")
print(normal_set)

# A frozen set
frozen_set = frozenset(["e", "f", "g"])
print("Frozen Set")
print(frozen_set)

# Uncommenting the line below would cause an error,
# because a frozen set cannot be modified
# frozen_set.add("h")
Enum
class Enum(set):
    def __getattr__(self, name):
        if name in self:
            return name
        raise AttributeError
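A short usage sketch may help here. It repeats the set-based class above so the block runs on its own, and, as a swapped-in alternative that the original text does not mention, contrasts it with the standard library `enum` module (Python 3.4+). `Animals` and `Color` are hypothetical names chosen for illustration.

```python
from enum import Enum as StdEnum  # standard library enum, aliased to avoid a name clash

class Enum(set):
    """Set-based enum from the text: member names double as values."""
    def __getattr__(self, name):
        if name in self:
            return name
        raise AttributeError(name)

Animals = Enum(["DOG", "CAT"])
dog = Animals.DOG  # attribute access falls through to __getattr__

class Color(StdEnum):
    """Standard library alternative with explicit values."""
    RED = 1
    GREEN = 2

red = Color.RED

print(dog, red.name, red.value)
```

The set-based version is compact, while the standard library version adds distinct member objects, values, and lookup by value such as `Color(2)`.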
Functions
Function Definition
Functions in Python are defined with the def keyword, for example:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))
# Prints "negative", "zero", "positive"
Python supports creating small anonymous functions at runtime, the so-called lambda functions:
def f(x):
    return x**2

# Equivalent to
g = lambda x: x**2
Parameters
Optional Arguments: Variable-Length Arguments
def example(a, b=None, *args, **kwargs):
    print(a, b)
    print(args)
    print(kwargs)

example(1, "var", 2, 3, word="hello")
# 1 var
# (2, 3)
# {'word': 'hello'}

a_tuple = (1, 2, 3, 4, 5)
a_dict = {"1": 1, "2": 2, "3": 3}
example(1, "var", *a_tuple, **a_dict)
# 1 var
# (1, 2, 3, 4, 5)
# {'1': 1, '2': 2, '3': 3}
When calling a function, the ** operator can likewise unpack a dictionary into keyword arguments:
func(**{'type': 'Event'})
# Equivalent to
func(type='Event')
Generators
def simple_generator_function():
    yield 1
    yield 2
    yield 3

for value in simple_generator_function():
    print(value)
# Output
# 1
# 2
# 3

our_generator = simple_generator_function()
next(our_generator)  # 1
next(our_generator)  # 2
next(our_generator)  # 3

# A typical use of generators is iterating over an infinite sequence
# (is_prime is assumed to be defined elsewhere)
def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1
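The `get_primes` generator above depends on an `is_prime` helper that the text does not define. The sketch below supplies one possible version (simple trial division, an assumption rather than the author's implementation) and uses `itertools.islice` to take a finite number of values from the infinite generator.

```python
from itertools import islice

def is_prime(number):
    """Assumed helper: trial division, adequate for small numbers."""
    if number < 2:
        return False
    divisor = 2
    while divisor * divisor <= number:
        if number % divisor == 0:
            return False
        divisor += 1
    return True

def get_primes(number):
    """Yield primes greater than or equal to number, without end."""
    while True:
        if is_prime(number):
            yield number
        number += 1

# islice stops after five values, so the infinite loop never runs away
first_five = list(islice(get_primes(2), 5))
print(first_five)
```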
Decorators
Decorators are a very useful design pattern:
# Simple decorator
from functools import wraps

def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('wrap function')
        return func(*args, **kwargs)
    return wrapper

@decorator
def example(*a, **kw):
    pass

example.__name__  # function attributes are preserved by wraps
# 'example'

# Decorator that takes an argument
def decorator_with_argument(val):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print("Val is {0}".format(val))
            return func(*args, **kwargs)
        return wrapper
    return decorator

@decorator_with_argument(10)
def example():
    print("This is example function.")

example()
# Val is 10
# This is example function.

# Equivalent to
def example():
    print("This is example function.")

example = decorator_with_argument(10)(example)
example()
# Val is 10
# This is example function.
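As one more illustration of the pattern, here is a sketch of a decorator that keeps state between calls. `count_calls` and `add` are hypothetical names introduced for this example; the point is that `wraps` preserves the wrapped function's name and docstring while the wrapper carries its own counter.

```python
from functools import wraps

def count_calls(func):
    """Hypothetical decorator: count how many times func is called."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0  # state lives on the wrapper function object
    return wrapper

@count_calls
def add(a, b):
    """Return a + b."""
    return a + b

results = [add(1, 2), add(3, 4)]
print(results, add.calls, add.__name__)
```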
Classes and Objects
Class Definition
Defining a class in Python is also straightforward:
class Greeter(object):

    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO, %s!' % self.name.upper())
        else:
            print('Hello, %s' % self.name)

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello, Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO, FRED!"
Managed Attributes
# property, setter, and deleter can override attribute (dot) access
class Example(object):
    def __init__(self, value):
        self._val = value

    @property
    def val(self):
        return self._val

    @val.setter
    def val(self, value):
        if not isinstance(value, int):
            raise TypeError("Expected int")
        self._val = value

    @val.deleter
    def val(self):
        del self._val

    @property
    def square3(self):
        return 2**3

ex = Example(123)
ex.val = "str"
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "test.py", line 12, in val
#     raise TypeError("Expected int")
# TypeError: Expected int
Class Methods and Static Methods
class example(object):
    @classmethod
    def clsmethod(cls):
        print("I am classmethod")

    @staticmethod
    def stmethod():
        print("I am staticmethod")

    def instmethod(self):
        print("I am instancemethod")

ex = example()
ex.clsmethod()   # I am classmethod
ex.stmethod()    # I am staticmethod
ex.instmethod()  # I am instancemethod

example.clsmethod()  # I am classmethod
example.stmethod()   # I am staticmethod
example.instmethod()
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: unbound method instmethod() ...
# (Python 2 message; Python 3 reports a missing 'self' argument instead)
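Objects
Instantiation
Attribute Operations
Object attributes differ from dictionary keys: they are read with the dot operator, and checking for them with in (against __dict__) can give misleading results, for example for properties:
class A(object):
    @property
    def prop(self):
        return 3

a = A()
print("'prop' in a.__dict__ =", 'prop' in a.__dict__)
print("hasattr(a, 'prop') =", hasattr(a, 'prop'))
print("a.prop =", a.prop)
# 'prop' in a.__dict__ = False
# hasattr(a, 'prop') = True
# a.prop = 3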
It is better to use hasattr, getattr, and setattr to work with object attributes:
class Example(object):
    def __init__(self):
        self.name = "ex"

    def printex(self):
        print("This is an example")

# Check whether an object has an attribute
# hasattr(obj, 'attr')
ex = Example()
hasattr(ex, "name")     # True
hasattr(ex, "printex")  # True
hasattr(ex, "print")    # False

# Get an object attribute
# getattr(obj, 'attr')
getattr(ex, 'name')  # 'ex'

# Set an object attribute
# setattr(obj, 'attr', value)
setattr(ex, 'name', 'example')
ex.name  # 'example'
Exceptions and Testing
Exception Handling
try
import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error: {0}".format(err))
except ValueError:
    print("Could not convert data to an integer.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

# Prints B, C, D: the first except clause that matches is the one that runs
for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")
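Context Manager - with
with is commonly used to acquire and release resources:
# Socket is assumed here to be a user-defined context manager wrapping a listening socket
host = 'localhost'
port = 5566
with Socket(host, port) as s:
    while True:
        conn, addr = s.accept()
        msg = conn.recv(1024)
        print(msg)
        conn.send(msg)
        conn.close()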
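Since the `Socket` class above is not defined in the text, here is a self-contained sketch of how a context manager can be written with `contextlib.contextmanager`. `managed_resource` and the `events` list are hypothetical stand-ins for a real resource; the example shows that the cleanup step runs both on normal exit and when the body raises.

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed_resource(name):
    """Hypothetical resource: log acquisition and release around the with body."""
    events.append("open " + name)
    try:
        yield name.upper()  # the value bound by "as"
    finally:
        events.append("close " + name)  # runs even if the body raises

with managed_resource("db") as handle:
    events.append("use " + handle)

try:
    with managed_resource("cache"):
        raise ValueError("boom")
except ValueError:
    events.append("handled")

print(events)
```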
Unit Testing
from __future__ import print_function
import unittest

def fib(n):
    return 1 if n <= 2 else fib(n - 1) + fib(n - 2)

def setUpModule():
    print("setup module")

def tearDownModule():
    print("teardown module")

class TestFib(unittest.TestCase):

    def setUp(self):
        print("setUp")
        self.n = 10

    def tearDown(self):
        print("tearDown")
        del self.n

    @classmethod
    def setUpClass(cls):
        print("setUpClass")

    @classmethod
    def tearDownClass(cls):
        print("tearDownClass")

    def test_fib_assert_equal(self):
        self.assertEqual(fib(self.n), 55)

    def test_fib_assert_true(self):
        self.assertTrue(fib(self.n) == 55)

if __name__ == "__main__":
    unittest.main()
Storage
File Reading and Writing
Path Handling
Python's built-in __file__ variable holds the path of the current file (possibly relative); it can be used to build an absolute path or to locate other files:
import os

# Directory of the current file (may be relative)
dir = os.path.dirname(__file__)  # src\app

# Once you're at the directory level you want, with the desired directory as the final path node:
dirname1 = os.path.basename(dir)
dirname2 = os.path.split(dir)[1]  # per the documentation, this is exactly what os.path.basename does

# Absolute path of the current file's directory; abspath completes a relative path against the current working directory
os.path.abspath(os.path.dirname(__file__))  # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# Real path of the current file's directory
os.path.dirname(os.path.realpath(__file__))  # D:\WorkSpace\OWS\tool\ui-tool-svn\python\src\app

# Current working directory
os.getcwd()
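On Python 3.4+, the standard library `pathlib` module offers an object-oriented way to do the same path arithmetic. This is a swapped-in alternative, not something the original text uses. The sketch works on a made-up path with `PurePosixPath` so that it never touches the file system and behaves the same on every platform; in a real script, `Path(__file__).resolve().parent` plays the role of the `os.path` calls above.

```python
from pathlib import PurePosixPath

# Hypothetical script location, used only for illustration
script = PurePosixPath("src/app/main.py")

app_dir = script.parent      # like os.path.dirname
dir_name = app_dir.name      # like os.path.basename
config = app_dir / "config" / "settings.ini"  # like os.path.join

print(app_dir, dir_name, config, script.suffix)
```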
The listdir and walk functions and the glob module can enumerate and search for files:
# List only the files
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

# Use walk; the break stops after the top-level directory (remove it to recurse)
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

# Use glob for pattern matching
import glob
print(glob.glob("/home/adam/*.txt"))
# ['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]
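Simple File Reading and Writing
import os

# Choose the write mode depending on whether the file already exists
mode = 'a' if os.path.exists(writepath) else 'w'

# with closes the file automatically, even if an exception is raised,
# so no explicit f.close() is needed
with open(writepath, mode) as f:
    f.write(...)
    ...

# Read the file contents
with open(writepath) as f:
    message = f.read()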
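To see the write-then-read cycle end to end, the sketch below runs it inside a temporary directory so nothing is left behind. The file name `file.dat` and the two sample lines are placeholders; the first write creates the file with mode 'w' and the second appends with mode 'a'.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    writepath = os.path.join(tmpdir, "file.dat")  # placeholder file name

    for line in ["first\n", "second\n"]:
        # 'w' on the first pass (file missing), 'a' afterwards
        mode = 'a' if os.path.exists(writepath) else 'w'
        with open(writepath, mode) as f:
            f.write(line)

    # Reopen for reading; the earlier handles are already closed
    with open(writepath) as f:
        message = f.read()

print(message)
```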