Table of Contents

  1. Project Background
  2. Requirements Analysis
  3. Overall Design
    1. Basic Design
    2. Detailed Design
      1. Image Compression: Faster and More Bandwidth-Efficient
      2. State Definitions: Making Complex Permissions Clear
      3. Resource Evaluation: Making Resource Allocation More Reasonable
    3. Architecture Design
      1. Server-Side Composition
        1. File Upload Scheme
        2. Pre-Warming Scheme
      2. Database Design
  4. Project Development
    1. Project Initialization
      1. Database Initialization
      2. Bucket Initialization
    2. Mini Program Development
      1. Page Development
      2. Common Methods
      3. Selected Methods of Core Pages
        1. Uploading Files
        2. Flexible Image List Layouts
    3. Server-Side Development
      1. Common Database Handling Method
      2. Login and Registration
      3. Global Definitions/Initialization
      4. User and Album Permission Check Method
      5. Uploading Images (Getting the Image Upload URL)
      6. Asynchronous Operations (OSS Trigger)
      7. Search Method
    4. Admin System Development
  5. Project Preview
    1. Mini Program Side
    2. Admin Pages
  6. Lessons Learned
    1. Web Frameworks and Alibaba Cloud Function Compute
    2. How to Debug Locally
  7. Summary

Project Background

Mini programs form a fascinating and thriving ecosystem. Applying the technical dividends of Serverless architecture to the mini program ecosystem makes an already highly efficient development model "even more efficient, more performant, and more stable". This article builds an AI-powered photo album system on a Serverless architecture; through its development process you can gain some understanding of the following:

  • How to develop a Serverless application at low cost and high efficiency
  • How to quickly migrate and deploy a traditional framework (Bottle, Flask, etc.) onto a Serverless architecture
  • How to better implement file uploads on a Serverless architecture
  • How a Serverless architecture can reduce the impact of cold starts
  • How to optimize a Serverless project to make its cost even lower
  • How to organically combine deep learning with a Serverless architecture
  • How Function Compute interacts with products such as NAS and object storage
  • How to implement user login, authentication, and the like on a Serverless architecture

Requirements Analysis

Before starting this case study, let's clarify where the requirements for this Serverless-based AI photo album mini program come from. I enjoy traveling and often visit places with friends; after all, traveling ten thousand miles beats reading ten thousand books. On every trip I am the kind of person who takes lots of photos on my phone, after which two interesting problems arise:

  • My friends and I each have some photos on our phones that we need to merge together. Usually I send mine to them and they send theirs to me;
  • Long afterwards, when I want to find a certain photo, I have to keep scrolling through my album, narrow down the approximate shooting time, and gradually work out which photo it is;

So I started wondering: could I build an album system as a mini program that satisfies the following:

image

Regarding the items mentioned here:

  • Creating albums and uploading images: easy to understand, and the basic capabilities of any album tool;
  • Shared albums and co-built albums: this part is more interesting. When creating an album, the user can decide whether it is private, shared with others, or co-built with others;
    • Shared means that I upload images and others can view them;
    • Co-built means that others and I can upload images to the album together and maintain it jointly;
  • Finding a target image through search: an interesting capability from the artificial intelligence field of image description/understanding (Image Caption). After an image is uploaded, the computer automatically converts it into text and stores it, so that the image can later be found through a text search, for example:

image

The difficulties involved:

  • This album is a relatively low-frequency tool. Buying servers, renting cloud hosts, and purchasing database products all require ongoing spending, so making the project run with high performance only when needed, while keeping costs down, is one difficulty;
  • If the project adopts a Serverless architecture, using a very large model on Function Compute is another difficulty, since the runtimes provided by typical cloud vendors offer only about 500 MB of space;
  • With a Serverless architecture, how to debug locally, and how to deploy a traditional framework to production, are open questions;
  • Uploading files under a Serverless architecture is tricky: the compute platform in Serverless, i.e. the FaaS platform, is usually event-driven, and cloud vendors generally limit the payload size, typically to around 6 MB. Uploading images directly would clearly trigger many oversized-event errors, so uploading files safely, performantly, and elegantly on a Serverless architecture becomes especially important;

Overall Design

Basic Design

Based on the requirements analysis, the project can be roughly divided into three main parts:

  • Mini program: the client
  • Server side: the API system the mini program interacts with
  • Admin system: where an administrator can observe and inspect the system globally

The overall structure of the project is roughly:

image

The rough design sketch of the project:

image

Detailed Design

Image Compression: Faster and More Bandwidth-Efficient

From the requirements analysis we know that what we are building is an album mini program, which essentially consists of two parts:

  • the image list
  • the image detail view

If original images were loaded everywhere, it would strain the user's bandwidth, network quality, and client performance on one side, and our server-side traffic costs on the other. So compressed images are combined with the originals: every list page loads the compressed copy, and opening the detail view loads the original. This keeps performance and cost under control while further improving overall efficiency and experience.
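
A minimal sketch of this rule; photo_key is a hypothetical helper added here for illustration, but the key prefixes match the server-side code shown later in this article:

# Choose which OSS object key to serve, depending on the view.
# Key prefixes follow the server-side code later in this article.
def photo_key(file_token, scene):
    if scene == "list":
        # list pages load the compressed copy
        return "photo/thumbnail/%s" % file_token
    # the detail view loads the original
    return "photo/original/%s" % file_token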

State Definitions: Making Complex Permissions Clear

To better manage the relationships between users and albums, and between users themselves, two extra tables are introduced, a user-album table and a user-user table, to control the relevant permissions.

Relationships between users may include:

  • self
  • friend
  • blacklisted
  • no relationship

Relationships between a user and an album may include:

  • the user's own album
  • an album someone else shared
    • with view permission
    • without view permission
  • an album someone else opened for co-building
    • with view permission
    • without view permission

The album's own state may be:

  • private album
  • shared album
  • co-built album
  • private to newly added users

These permission and state definitions are the key building blocks of the user-user and user-album relationships; the sketch below names the integer codes they map to.
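
The integer codes below are inferred from the permission-check logic shown later in this article (checkAlbumPermission and the UserRelationship handling); the constant names are hypothetical, added only for readability:

# Hypothetical constant names for the integer codes used later in this article.

# Album.acl
ACL_PRIVATE = 0        # only the owner may access
ACL_SHARED = 1         # others may view
ACL_COBUILD = 2        # others may view and upload

# checkAlbumPermission() return values
PERM_NEED_PASSWORD = -1
PERM_NONE = 0
PERM_VIEW = 1
PERM_EDIT = 2
PERM_OWNER = 3

# UserRelationship.type
REL_BLACKLIST = -1
REL_FRIEND = 1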

Resource Evaluation: Making Resource Allocation More Reasonable

Resource allocation is a perennial topic. In the cloud-host era, launching a project inevitably required resource evaluation: how many cores and how much memory to buy, how much bandwidth, and so on, driven by our estimates of business volume and user volume. If the whole business adopts a Serverless architecture, do we still need to evaluate the instance specifications we use? How do we evaluate them more reasonably? And should all interfaces be placed together?

The approach taken here is:

  • split out the interfaces that consume the most resources
  • group interfaces by business domain and scenario

For example, relatively simple interfaces may involve only database CRUD plus some simple logic; these can be grouped as a whole by business domain and scenario, and they all use essentially the same function specification. Other interfaces, such as those that describe an image (the Image Caption logic), inevitably need far more resources, so they are placed in separate functions. Evaluating resources this way and fine-tuning instance specifications lets us reduce cost further.

Architecture Design

Relative to traditional architectures, Serverless offers:

  • extremely low maintenance cost, so it demands far less of our attention;
  • a pay-per-use billing model, which is far more economical;

So, to improve development efficiency and reduce overall cost, this project is built on a Serverless architecture. The latter point, lower cost through pay-per-use billing, was a decisive factor: the project involves deep learning, image compression and conversion, and similar logic, which can put heavy pressure on a server at certain moments yet leave it nearly idle once uploads finish. Traditional resource evaluation is therefore highly ambiguous here, whereas a Serverless architecture, with pay-per-use billing plus the triggers offered by various BaaS products, not only handles the cost problem better but also removes part of the operations work. It kills several birds with one stone.

image

The project uses Function Compute, HTTP triggers, Object Storage (OSS), NAS, and other services. Specifically:

  • HTTP triggers take over what Nginx configuration does on a traditional server;
  • Function Compute provides sufficient computing power for the project;
  • Object Storage stores the photos users upload and other resources;
  • NAS carries two workloads:
    • for projects whose code packages or models are large, dependencies and other files can be placed on NAS, so that more can be done within the roughly 500 MB instance limit;
    • to further reduce cost, an SQLite database serves as the project's database and is stored on NAS;

Meanwhile, to better match traditional development habits, raise development efficiency, and keep a more familiar local debugging environment during development, the project is written directly against traditional web frameworks and then pushed to the online environment with tooling:

image

The project's technology choices:

  • Client: WeChat mini program, using ColorUI:

image

  • Server side: the backend API uses the Bottle framework. Bottle is a very concise, lightweight web framework; in sharp contrast to Django, it consists of a single file with only about 3,700 lines of code and depends only on the Python standard library. Small as it is, it has all the essentials and is well suited to small web applications.
  • Admin system: built on Flask, implemented directly with SQLite-Web, a web-based SQLite database browser written in Python.
  • Storage: OSS object storage holds the album content; NAS holds dependencies, the database, and other files.
  • Compute platform: Function Compute (FC) throughout.

Server-Side Composition

The server side consists of two parts: synchronous tasks and asynchronous tasks.

File Upload Scheme

Among the synchronous tasks, one request adds an image. The method this project adopts for adding images is direct upload to OSS plus asynchronous processing.

That is, when the user uploads an image locally, the client asks the backend to add the image; after recording the image data, the backend returns an upload URL; the client uploads the image directly; and tasks such as updating the image state, compressing the image, and running image captioning are then triggered asynchronously:

image

Pre-Warming Scheme

As is well known, Serverless architectures suffer from a fairly serious cold-start problem. To address it, many vendors have introduced reserved instances, but for now I personally still prefer a home-made pre-warming scheme:

image

The home-made pre-warming scheme is simply a function on a timer trigger that sends requests to the function to be warmed.

For example, the function to be warmed exposes a method:

@bottle.route('/prewarm', method='GET')
def preWarm():
    time.sleep(3)
    return response("Pre Warm")

My pre-warming function can then send requests to this method:

# -*- coding: utf-8 -*-
import _thread
import urllib.request
import time

def preWarm(number):
    print("%s\t start time: %s" % (number, time.time()))
    url = "http://www.aialbum.net/prewarm"
    print(urllib.request.urlopen(url).read().decode("utf-8"))
    print("%s\t end time: %s" % (number, time.time()))

def handler(event, context):
    try:
        for i in range(0, 6):
            _thread.start_new_thread(preWarm, (i,))
    except:
        print("Error: unable to start thread")

    time.sleep(5)

    return True

Next, the pre-warming function itself is fired by a timer trigger (for example, every 3 minutes), which keeps 6 instances of the target function warm.

Database Design

Analyzing the requirements in depth, the database structure can be determined as:

image

Project Development

Project Initialization

Database Initialization

Rather than project initialization, this is really database initialization:

The album table:

CREATE TABLE Album (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    name CHAR(255) NOT NULL,
    create_time CHAR(255) NOT NULL,
    record_time CHAR(255) NOT NULL,
    place CHAR(255),
    acl INT NOT NULL,
    password CHAR(255),
    description TEXT,
    remark TEXT,
    lifecycle_state INT,
    photo_count INT NOT NULL,
    acl_state INT,
    picture CHAR(255)
)

The photo table:

CREATE TABLE Photo (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    create_time TEXT NOT NULL,
    update_time TEXT NOT NULL,
    album CHAR(255) NOT NULL,
    file_token CHAR(255) NOT NULL,
    user INT NOT NULL,
    description CHAR(255) NOT NULL,
    remark TEXT,
    state INT NOT NULL,
    delete_time TEXT,
    place TEXT,
    name CHAR(255),
    views INT NOT NULL,
    delete_user CHAR(255),
    "user_description" TEXT
)

The tag table:

CREATE TABLE Tags (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    name CHAR(255) NOT NULL UNIQUE,
    remark TEXT
)

The user table:

CREATE TABLE User (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    username CHAR(255) NOT NULL,
    token CHAR(255) NOT NULL UNIQUE,
    avatar CHAR(255) NOT NULL,
    secret CHAR(255) NOT NULL UNIQUE,
    place CHAR(255),
    gender INT NOT NULL,
    register_time CHAR(255) NOT NULL,
    state INT NOT NULL,
    remark TEXT
)

The user relationship table:

CREATE TABLE UserRelationship (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    origin INT NOT NULL,
    target INT NOT NULL,
    type INT NOT NULL,
    relationship CHAR(255) NOT NULL UNIQUE,
    remark TEXT
)

The album-tag relationship table:

CREATE TABLE AlbumTag (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    album INT NOT NULL,
    tag INT NOT NULL
)

The album-user relationship table:

CREATE TABLE AlbumUser (
    id INTEGER PRIMARY KEY autoincrement NOT NULL,
    user INT NOT NULL,
    album INT NOT NULL,
    type INT NOT NULL,
    album_user CHAR(255) NOT NULL UNIQUE,
    remark TEXT
)

Bucket Initialization

Once that is done, we also need to create a bucket; for example, I create one in OSS:

image
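
Besides the console, the bucket can also be created with the oss2 SDK; a minimal sketch, where the access key pair, endpoint, and bucket name are placeholders:

import oss2

# Placeholder credentials, endpoint, and bucket name
auth = oss2.Auth('<AccessKeyId>', '<AccessKeySecret>')
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'my-album-bucket')

# Create a private bucket; photos are served via pre-signed URLs instead
bucket.create_bucket(oss2.BUCKET_ACL_PRIVATE)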

Mini Program Development

Page Development

Developing the mini program involves two main parts: page layout and data rendering.

The page layout mainly uses ColorUI components, for example to define a page's overall style:

image

Then the designed pages are assembled flexibly from the available components, for example:

image

Common Methods

With the mini program pages done, the data handling is abstracted into common methods, for example the method for calling the backend:

// Unified request helper
doPost: async function (uri, data, option = {
  secret: true,
  method: "POST"
}) {
  let times = 20
  const that = this
  let initStatus = false
  if (option.secret) {
    while (!initStatus && times > 0) {
      times = times - 1
      if (this.globalData.secret) {
        data.secret = this.globalData.secret
        initStatus = true
        break
      }
      await that.sleep(500)
    }
  } else {
    initStatus = true
  }
  if (initStatus) {
    return new Promise((resolve, reject) => {
      wx.request({
        url: that.url + uri,
        data: data,
        header: {
          "Content-Type": "text/plain"
        },
        method: option.type ? option.type : "POST",
        success: function (res) {
          console.log("RES: ", res)
          if (res.data.Body && res.data.Body.Error && res.data.Body.Error == "UserInformationError") {
            wx.redirectTo({
              url: '/pages/login/index',
            })
          } else {
            resolve(res.data)
          }
        },
        fail: function (res) {
          reject(null)
        }
      })
    })
  }
}

For example, the login module:

login: async function () {
  const that = this
  const postData = {}
  let initStatus = false
  while (!initStatus) {
    if (this.globalData.token) {
      postData.token = this.globalData.token
      initStatus = true
      break
    }
    await that.sleep(200)
  }
  if (this.globalData.userInfo) {
    postData.username = this.globalData.userInfo.nickName
    postData.avatar = this.globalData.userInfo.avatarUrl
    postData.place = (this.globalData.userInfo.country || "") + (this.globalData.userInfo.province || "") + (this.globalData.userInfo.city || "")
    postData.gender = this.globalData.userInfo.gender
  }
  try {
    this.doPost('/login', postData, {
      secret: false,
      method: "POST"
    }).then(function (result) {
      if (result.secret) {
        that.globalData.secret = result.secret
      } else {
        that.responseAction(
          "Login failed",
          String(result.Body.Message)
        )
      }
    })
  } catch (ex) {
    this.failRequest()
  }
}

Selected Methods of Core Pages

Uploading Files

When uploading images to Alibaba Cloud Object Storage, I ran into an awkward problem:

  • uploading directly with the access keys does not seem very safe;
  • uploading via a pre-signed URL seems safe, but the mini program and Object Storage conflict a little:
    • the mini program's uploadFile method supports POST only;
    • the Object Storage SDK's pre-signing appears to support only PUT and GET;

The final solution is to have the server generate a temporary URL with the Object Storage SDK:

uploadUrl = "https://upload.aialbum.net"
replaceUrl = lambda method: downloadUrl if method == "GET" else uploadUrl
getSourceUrl = lambda objectName, method="GET", expiry=600: bucket.sign_url(method, objectName, expiry)
SignUrl = lambda objectName, method="GET", expiry=600: getSourceUrl(objectName, method, expiry).replace(sourcePublicUrl, replaceUrl(method))
# Usage:
returnData = {"index": index, "url": SignUrl(file_path, "PUT", 600)}

The mini program itself has a file upload API:

image

That API's documentation states that "the client initiates an HTTPS POST request", while the pre-signing capability of the Alibaba Cloud OSS service we use only supports the GET and PUT methods. So we cannot simply combine pre-signing with wx.uploadFile(Object object) to upload files. Instead we need to:

  • read the file
  • upload it with wx.request(Object object), specifying the PUT method

The basic implementation is:

uploadData: function () {
  const that = this
  const uploadFiles = this.data.imageType == 1 ? this.data.originalPhotos : this.data.thumbnailPhotos
  for (let i = 0; i < uploadFiles.length; i++) {
    if (that.data.imgListState[i] != "complete") {
      const imgListState = that.data.imgListState
      try {
        app.doPost('/picture/upload/url/get', {
          album: that.data.album[that.data.index].id,
          index: i,
          file: uploadFiles[i]
        }).then(function (result) {
          if (!result.Body.Error) {
            imgListState[result.Body.index] = 'uploading'
            that.setData({
              imgListState: imgListState
            })
            wx.request({
              method: 'PUT',
              url: result.Body.url,
              data: wx.getFileSystemManager().readFileSync(uploadFiles[result.Body.index]),
              header: {
                "Content-Type": " "
              },
              success(res) {
              },
              fail(res) {
              },
              complete(res) {
              }
            })
          } else {
          }
        })
      } catch (ex) {
      }
    }
  }
}

Flexible Image List Layouts

To make the tool behave more like the album systems we are used to, the list can be manipulated with on-screen gestures:

image

That is, a two-finger pinch gesture adjusts how many images are shown per row:

image

The implementation is essentially:

/**
 * Adjust the image grid
 */
touchendCallback: function (e) {
  this.setData({
    distance: null
  })
},

touchmoveCallback: function (e) {
  if (e.touches.length == 1) {
    return
  }
  // Two touch points detected
  let xMove = e.touches[1].clientX - e.touches[0].clientX
  let yMove = e.touches[1].clientY - e.touches[0].clientY
  let distance = Math.sqrt(xMove * xMove + yMove * yMove)
  if (this.data.distance) {
    // A previous state exists
    let tempDistance = this.data.distance - distance
    let scale = parseInt(Math.abs(tempDistance / this.data.windowRate))
    if (scale >= 1) {
      let rowCount = tempDistance > 0 ? this.data.rowCount + scale : this.data.rowCount - scale
      rowCount = rowCount <= 1 ? 1 : (rowCount >= 5 ? 5 : rowCount)
      this.setData({
        rowCount: rowCount,
        rowWidthHeight: wx.getSystemInfoSync().windowWidth / rowCount,
        distance: distance
      })
    }
  } else {
    // No previous state
    this.setData({
      distance: distance
    })
  }
},

That is, by counting the touch points on the screen and applying the Pythagorean theorem, we determine the distance between the two fingers as they spread apart or pinch together.

Server-Side Development

Server-side development centers on two functions:

  • the synchronous Bottle API
  • the function for asynchronous work

Most of the synchronous Bottle API consists of database CRUD operations and permission checks.

Common Database Handling Method

For example, the unified database access method:

# Database access
def Action(sentence, data=(), throw=True):
    '''
    Database access
    :param throw: whether to re-raise exceptions
    :param sentence: the statement to execute
    :param data: the data to pass in
    :return:
    '''
    try:
        for i in range(0, 5):
            try:
                cursor = connection.cursor()
                result = cursor.execute(sentence, data)
                connection.commit()
                return result
            except Exception as e:
                if "disk I/O error" in str(e):
                    time.sleep(0.2)
                    continue
                elif "lock" in str(e):  # retry while SQLite reports the database is locked
                    time.sleep(1.1)
                    continue
                else:
                    raise e
    except Exception as err:
        print(err)
        if throw:
            raise err
        else:
            return False
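
A minimal usage sketch (the statements and values are illustrative):

# Read: returns the sqlite3 cursor, so fetchone()/fetchall() work directly
user = Action("SELECT * FROM User WHERE `secret`=?;", (secret,)).fetchone()

# Write: pass throw=False to get False back on failure instead of an exception
ok = Action("INSERT INTO Tags(`name`, `remark`) VALUES (?, ?);", ("travel", ""), throw=False)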

Login and Registration

The login/registration endpoint:

# Login endpoint
@bottle.route('/login', method='POST')
def login():
    try:
        postData = json.loads(bottle.request.body.read().decode("utf-8"))
        token = postData.get('token', None)
        username = postData.get('username', '')
        avatar = postData.get('avatar', getAvatar())
        place = postData.get('place', "Solar System, Earth")
        gender = postData.get('gender', "-1")
        tempSecret = getMD5(str(token)) + getRandomStr(50)
        if token:
            # If the token is in the database, update and log in; otherwise register
            print("Got token.")
            dbResult = Action("SELECT * FROM User WHERE `token`=?;", (token,))
            user = dbResult.fetchone()
            if user:
                print("User exists.")
                tempSecret = user[4]
                # Check whether the data is unchanged, and update if not
                if not (username == user[1] and avatar == user[3] and place == user[5] and gender == user[6]):
                    # Update
                    print("User exists. Updating ...")
                    updateStmt = "UPDATE User SET `username`=?, `avatar`=?, `place`=?, `gender`=? WHERE `id`=?;"
                    Action(updateStmt, (username, avatar, place, gender, user[0]))
            else:
                print("User does not exist. Creating ...")
                # No record found; insert the data
                insertStmt = ("INSERT INTO User(`username`, `token`, `avatar`, `secret`, `place`, `gender`, "
                              "`register_time`, `state`, `remark`) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?);")
                Action(insertStmt, (username, token, avatar, tempSecret, place, gender, str(getTime()), 1, ''))
            # Query the data again once done
            print("Getting user information ...")
            userData = getUserBySecret(tempSecret)
            return userData if userData else response(ERROR['SystemError'], 'SystemError')
        else:
            return response(ERROR['ParameterException'], 'ParameterException')
    except Exception as e:
        print("Error: ", e)
        return response(ERROR['SystemError'], 'SystemError')

Global Definitions/Initialization

Of course, to simplify some operations, a set of lambda helpers and a series of global variables are also defined:

# OSS bucket object
bucket = oss2.Bucket(oss2.Auth(AccessKeyId, AccessKeySecret), OSS_REGION_ENDPOINT[Region]['public'], Bucket)
# Database connection object
connection = sqlite3.connect(Database, timeout=2)
# Pre-signing helpers
ossPublicUrl = OSS_REGION_ENDPOINT[Region]['public']
sourcePublicUrl = "http://%s.%s" % (Bucket, ossPublicUrl)
downloadUrl = "https://download.aialbum.net"
uploadUrl = "https://upload.aialbum.net"
replaceUrl = lambda method: downloadUrl if method == "GET" else uploadUrl
getSourceUrl = lambda objectName, method="GET", expiry=600: bucket.sign_url(method, objectName, expiry)
SignUrl = lambda objectName, method="GET", expiry=600: getSourceUrl(objectName, method, expiry).replace(sourcePublicUrl, replaceUrl(method))
thumbnailKey = lambda key: "photo/thumbnail/%s" % (key) if bucket.object_exists("photo/thumbnail/%s" % (key)) else "photo/original/%s" % (key)
# Unified response structure
response = lambda message, error=False: {'Id': str(uuid.uuid4()),
                                         'Body': {
                                             "Error": error,
                                             "Message": message,
                                         } if error else message}
# Default avatar / album cover
defaultPicture = "%s/static/images/%s/%s.jpg"
getAvatar = lambda: defaultPicture % (downloadUrl, "avatar", random.choice(range(1, 6)))
getAlbumPicture = lambda: defaultPicture % (downloadUrl, "album", random.choice(range(1, 6)))
# Random string
seeds = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' * 100
getRandomStr = lambda num=200: "".join(random.sample(seeds, num))
# MD5 hash
getMD5 = lambda content: hashlib.md5(content.encode("utf-8")).hexdigest()
# Formatted current time
getTime = lambda: time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
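
A short sketch of how these helpers combine (the file token is a placeholder):

# Ten-minute pre-signed download URL, preferring the compressed copy if it exists
url = SignUrl(thumbnailKey("<file_token>.jpg"), "GET", 600)

# Unified payloads: success wraps the message, errors add an Error field
okPayload = response({"url": url})
errPayload = response("Permission denied", "PermissionException")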

User and Album Permission Check Method

The project constantly needs to determine the relationship between a user and an album. In this project that relationship is complex, so a fairly involved flow is required:

image

A common method is therefore added to determine the permissions between an album and a user:

# Album permission check
def checkAlbumPermission(albumId, userId, password=None):
    '''
    :param albumId: album ID
    :param userId: user ID
    :param password: password, empty by default
    :return:
        -1: password required
        0: no permission
        1: may view
        2: may edit
        3: the user's own album
    '''

    deleteAlbumUser = lambda user, album: Action("DELETE FROM AlbumUser WHERE album=? AND user=?", (album, user), False)

    album = Action(("SELECT albumUser.id albumUser_id, album.id album_id, * FROM AlbumUser AS albumUser "
                    "INNER JOIN Album AS album WHERE albumUser.`album`=album.`id` "
                    "AND albumUser.`album`=? AND albumUser.`type`=1;"), (albumId,)).fetchone()

    # The album does not exist
    if not album:
        return 0

    # An album relationship exists and the album is the user's own
    if album['user'] == userId:
        # The user's own album: highest permission, return immediately
        return 3

    # The album has had sharing turned off
    if album['acl'] == 0:
        # The album has been made private
        deleteAlbumUser(userId, albumId)
        return 0

    tempAlbum = Action("SELECT * FROM AlbumUser WHERE user=? AND album=?;", (userId, albumId), False).fetchone()
    # No album relationship, and no further authorization is offered
    if not tempAlbum and album["acl_state"] == 1:
        return 0

    # Sharing is on, but the album has a password
    if album['password'] and album['password'] != password:
        # A password is required and the given one is wrong
        return -1

    # If the user is blacklisted, there is no permission
    searchStmt = "SELECT * FROM UserRelationship WHERE `origin`=? AND `target`=?;"
    userRelationship = Action(searchStmt, (album['user'], userId)).fetchone()
    if userRelationship and userRelationship['type'] == -1:
        deleteAlbumUser(userId, albumId)
        return 0

    if not userRelationship:
        # Record the user relationship
        insertStmt = ("INSERT INTO UserRelationship (`origin`, `target`, `type`, `relationship`, `remark`) "
                      "VALUES (?, ?, ?, ?, ?)")
        Action(insertStmt, (userId, album["user"], 1, "%s->%s" % (userId, album["user"]), ""), False)
        Action(insertStmt, (album["user"], userId, 1, "%s->%s" % (album["user"], userId), ""), False)

    if not tempAlbum:
        # Record the album relationship
        insertStmt = "INSERT INTO AlbumUser(`user`, `album`, `type`, `album_user`, `remark`) VALUES (?, ?, ?, ?, ?);"
        Action(insertStmt, (userId, albumId, 2, "%s-%s" % (userId, albumId), ""), False)

    return album['acl']

Whenever the relationship between a user and an album needs to be determined, this method can be called directly and the next step decided from its return value.
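
For example, a sketch that mirrors how the upload endpoint below consumes the return value:

permission = checkAlbumPermission(albumId, user["id"], password)
if permission < 2:
    # view-only or no access at all: refuse the write operation
    result = response(ERROR['PermissionException'], 'PermissionException')
else:
    # 2 (may edit) or 3 (own album): allowed to upload into this album
    pass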

Uploading Images (Getting the Image Upload URL)

Take adding an image as an example (strictly speaking, getting the image upload URL). After the user picks an album, the endpoint reads the parameters and checks the relationship between the album and the user; if the user is allowed to upload images to that album, it proceeds:

# Image management: add an image
@bottle.route('/picture/upload/url/get', method='POST')
def getPictureUploadUrl():
    try:
        # Read parameters
        postData = json.loads(bottle.request.body.read().decode("utf-8"))
        secret = postData.get('secret', None)
        albumId = postData.get('album', None)
        index = postData.get('index', None)
        password = postData.get('password', None)
        name = postData.get('name', "")
        file = postData.get('file', "")

        tempFileEnd = "." + file.split(".")[-1]
        tempFileEnd = tempFileEnd if tempFileEnd in ['.png', '.jpg', '.bmp', '.jpeg', '.gif', '.svg', '.psd'] else ".png"

        file_token = getMD5(str(albumId) + name + secret) + getRandomStr(50) + tempFileEnd
        file_path = "photo/original/%s" % (file_token)

        # Validate parameters
        if not checkParameter([secret, albumId, index]):
            return response(ERROR['ParameterException'], 'ParameterException')

        # Check whether the user exists
        user = Action("SELECT * FROM User WHERE `secret`=? AND `state`=1;", (secret,)).fetchone()
        if not user:
            return response(ERROR['UserInformationError'], 'UserInformationError')

        # Permission check
        if checkAlbumPermission(albumId, user["id"], password) < 2:
            return response(ERROR['PermissionException'], 'PermissionException')

        insertStmt = ("INSERT INTO Photo (`create_time`, `update_time`, `album`, `file_token`, `user`, `description`, "
                      "`delete_user`, `remark`, `state`, `delete_time`, `views`, `place`, `name`) "
                      "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)")
        insertData = ("", getTime(), albumId, file_token, user["id"], "", "", "", 0, "", 0, "", name)
        Action(insertStmt, insertData)
        return response({"index": index, "url": SignUrl(file_path, "PUT", 600)})
    except Exception as e:
        print("Error: ", e)
        return response(ERROR['SystemError'], 'SystemError')

Asynchronous Operations (OSS Trigger)

The asynchronous methods rely on an OSS trigger:

image

This function uses no framework; it is written natively for Function Compute. It consists of several main parts.

Image format conversion:

import os

import cv2 as cv
from PIL import Image

def PNG_JPG(PngPath, JpgPath):
    img = cv.imread(PngPath, 0)
    w, h = img.shape[::-1]
    infile = PngPath
    outfile = JpgPath
    img = Image.open(infile)
    img = img.resize((int(w / 2), int(h / 2)), Image.ANTIALIAS)
    try:
        if len(img.split()) == 4:
            # Drop the alpha channel before saving as JPG
            r, g, b, a = img.split()
            img = Image.merge("RGB", (r, g, b))
            img.convert('RGB').save(outfile, quality=70)
            os.remove(PngPath)
        else:
            img.convert('RGB').save(outfile, quality=70)
            os.remove(PngPath)
        return outfile
    except Exception as e:
        print(e)
        return False

Image compression:

from PIL import Image

# localSourceFile / localTargetFile are local paths prepared by the trigger handler
image = Image.open(localSourceFile)
width = 450
height = image.size[1] / (image.size[0] / width)
imageObj = image.resize((int(width), int(height)))
imageObj.save(localTargetFile)

Image captioning: this part draws on deep learning; its core code includes:

base_model.py:

import os
import numpy as np
import tensorflow as tf
import pickle
from tqdm import tqdm
from utils.nn import NN
from utils.misc import ImageLoader, CaptionData, TopN


class BaseModel(object):
    def __init__(self, config):
        self.config = config
        self.is_train = False
        self.train_cnn = self.is_train and config.train_cnn
        self.image_loader = ImageLoader('./utils/ilsvrc_2012_mean.npy')
        self.image_shape = [224, 224, 3]
        self.nn = NN(config)
        self.global_step = tf.Variable(0, name='global_step', trainable=False)
        self.build()

    def build(self):
        raise NotImplementedError()

    def beam_search(self, sess, image_files, vocabulary):
        """Use beam search to generate the captions for a batch of images."""
        # Feed in the images to get the contexts and the initial LSTM states
        config = self.config
        images = self.image_loader.load_images(image_files)
        contexts, initial_memory, initial_output = sess.run(
            [self.conv_feats, self.initial_memory, self.initial_output],
            feed_dict={self.images: images})

        partial_caption_data = []
        complete_caption_data = []
        for k in range(config.batch_size):
            initial_beam = CaptionData(sentence=[],
                                       memory=initial_memory[k],
                                       output=initial_output[k],
                                       score=1.0)
            partial_caption_data.append(TopN(config.beam_size))
            partial_caption_data[-1].push(initial_beam)
            complete_caption_data.append(TopN(config.beam_size))

        # Run beam search
        for idx in range(config.max_caption_length):
            partial_caption_data_lists = []
            for k in range(config.batch_size):
                data = partial_caption_data[k].extract()
                partial_caption_data_lists.append(data)
                partial_caption_data[k].reset()

            num_steps = 1 if idx == 0 else config.beam_size
            for b in range(num_steps):
                if idx == 0:
                    last_word = np.zeros((config.batch_size), np.int32)
                else:
                    last_word = np.array([pcl[b].sentence[-1]
                                          for pcl in partial_caption_data_lists],
                                         np.int32)

                last_memory = np.array([pcl[b].memory
                                        for pcl in partial_caption_data_lists],
                                       np.float32)
                last_output = np.array([pcl[b].output
                                        for pcl in partial_caption_data_lists],
                                       np.float32)

                memory, output, scores = sess.run(
                    [self.memory, self.output, self.probs],
                    feed_dict={self.contexts: contexts,
                               self.last_word: last_word,
                               self.last_memory: last_memory,
                               self.last_output: last_output})

                # Find the beam_size most probable next words
                for k in range(config.batch_size):
                    caption_data = partial_caption_data_lists[k][b]
                    words_and_scores = list(enumerate(scores[k]))
                    words_and_scores.sort(key=lambda x: -x[1])
                    words_and_scores = words_and_scores[0:config.beam_size + 1]

                    # Append each of these words to the current partial caption
                    for w, s in words_and_scores:
                        sentence = caption_data.sentence + [w]
                        score = caption_data.score * s
                        beam = CaptionData(sentence,
                                           memory[k],
                                           output[k],
                                           score)
                        if vocabulary.words[w] == '.':
                            complete_caption_data[k].push(beam)
                        else:
                            partial_caption_data[k].push(beam)

        results = []
        for k in range(config.batch_size):
            if complete_caption_data[k].size() == 0:
                complete_caption_data[k] = partial_caption_data[k]
            results.append(complete_caption_data[k].extract(sort=True))

        return results

    def load(self, sess, model_file=None):
        """ Load the model. """
        config = self.config
        if model_file is not None:
            save_path = model_file
        else:
            info_path = os.path.join(config.save_dir, "config.pickle")
            info_file = open(info_path, "rb")
            config = pickle.load(info_file)
            global_step = config.global_step
            info_file.close()
            save_path = os.path.join(config.save_dir,
                                     str(global_step) + ".npy")

        print("Loading the model from %s..." % save_path)
        data_dict = np.load(save_path, allow_pickle=True, encoding="bytes").item()
        count = 0
        for v in tqdm(tf.compat.v1.global_variables()):
            if v.name in data_dict.keys():
                sess.run(v.assign(data_dict[v.name]))
                count += 1
        print("%d tensors loaded." % count)

generator.py:

import tensorflow as tf
from base_model import BaseModel

class CaptionGenerator(BaseModel):
    def build(self):
        """ Build the model. """
        self.build_cnn()
        self.build_rnn()
        if self.is_train:
            self.build_optimizer()
            self.build_summary()

    def build_cnn(self):
        """ Build the CNN. """
        print("Building the CNN...")
        if self.config.cnn == 'vgg16':
            self.build_vgg16()
        else:
            self.build_resnet50()
        print("CNN built.")

    def build_vgg16(self):
        """ Build the VGG16 net. """
        config = self.config

        images = tf.compat.v1.placeholder(
            dtype=tf.float32,
            shape=[config.batch_size] + self.image_shape)

        conv1_1_feats = self.nn.conv2d(images, 64, name='conv1_1')
        conv1_2_feats = self.nn.conv2d(conv1_1_feats, 64, name='conv1_2')
        pool1_feats = self.nn.max_pool2d(conv1_2_feats, name='pool1')

        conv2_1_feats = self.nn.conv2d(pool1_feats, 128, name='conv2_1')
        conv2_2_feats = self.nn.conv2d(conv2_1_feats, 128, name='conv2_2')
        pool2_feats = self.nn.max_pool2d(conv2_2_feats, name='pool2')

        conv3_1_feats = self.nn.conv2d(pool2_feats, 256, name='conv3_1')
        conv3_2_feats = self.nn.conv2d(conv3_1_feats, 256, name='conv3_2')
        conv3_3_feats = self.nn.conv2d(conv3_2_feats, 256, name='conv3_3')
        pool3_feats = self.nn.max_pool2d(conv3_3_feats, name='pool3')

        conv4_1_feats = self.nn.conv2d(pool3_feats, 512, name='conv4_1')
        conv4_2_feats = self.nn.conv2d(conv4_1_feats, 512, name='conv4_2')
        conv4_3_feats = self.nn.conv2d(conv4_2_feats, 512, name='conv4_3')
        pool4_feats = self.nn.max_pool2d(conv4_3_feats, name='pool4')

        conv5_1_feats = self.nn.conv2d(pool4_feats, 512, name='conv5_1')
        conv5_2_feats = self.nn.conv2d(conv5_1_feats, 512, name='conv5_2')
        conv5_3_feats = self.nn.conv2d(conv5_2_feats, 512, name='conv5_3')

        reshaped_conv5_3_feats = tf.reshape(conv5_3_feats,
                                            [config.batch_size, 196, 512])

        self.conv_feats = reshaped_conv5_3_feats
        self.num_ctx = 196
        self.dim_ctx = 512
        self.images = images

    def build_resnet50(self):
        """ Build the ResNet50. """
        config = self.config

        images = tf.placeholder(
            dtype=tf.float32,
            shape=[config.batch_size] + self.image_shape)

        conv1_feats = self.nn.conv2d(images,
                                     filters=64,
                                     kernel_size=(7, 7),
                                     strides=(2, 2),
                                     activation=None,
                                     name='conv1')
        conv1_feats = self.nn.batch_norm(conv1_feats, 'bn_conv1')
        conv1_feats = tf.nn.relu(conv1_feats)
        pool1_feats = self.nn.max_pool2d(conv1_feats,
                                         pool_size=(3, 3),
                                         strides=(2, 2),
                                         name='pool1')

        res2a_feats = self.resnet_block(pool1_feats, 'res2a', 'bn2a', 64, 1)
        res2b_feats = self.resnet_block2(res2a_feats, 'res2b', 'bn2b', 64)
        res2c_feats = self.resnet_block2(res2b_feats, 'res2c', 'bn2c', 64)

        res3a_feats = self.resnet_block(res2c_feats, 'res3a', 'bn3a', 128)
        res3b_feats = self.resnet_block2(res3a_feats, 'res3b', 'bn3b', 128)
        res3c_feats = self.resnet_block2(res3b_feats, 'res3c', 'bn3c', 128)
        res3d_feats = self.resnet_block2(res3c_feats, 'res3d', 'bn3d', 128)

        res4a_feats = self.resnet_block(res3d_feats, 'res4a', 'bn4a', 256)
        res4b_feats = self.resnet_block2(res4a_feats, 'res4b', 'bn4b', 256)
        res4c_feats = self.resnet_block2(res4b_feats, 'res4c', 'bn4c', 256)
        res4d_feats = self.resnet_block2(res4c_feats, 'res4d', 'bn4d', 256)
        res4e_feats = self.resnet_block2(res4d_feats, 'res4e', 'bn4e', 256)
        res4f_feats = self.resnet_block2(res4e_feats, 'res4f', 'bn4f', 256)

        res5a_feats = self.resnet_block(res4f_feats, 'res5a', 'bn5a', 512)
        res5b_feats = self.resnet_block2(res5a_feats, 'res5b', 'bn5b', 512)
        res5c_feats = self.resnet_block2(res5b_feats, 'res5c', 'bn5c', 512)

        reshaped_res5c_feats = tf.reshape(res5c_feats,
                                          [config.batch_size, 49, 2048])

        self.conv_feats = reshaped_res5c_feats
        self.num_ctx = 49
        self.dim_ctx = 2048
        self.images = images

    def resnet_block(self, inputs, name1, name2, c, s=2):
        """ A basic block of ResNet. """
        branch1_feats = self.nn.conv2d(inputs,
                                       filters=4 * c,
                                       kernel_size=(1, 1),
                                       strides=(s, s),
                                       activation=None,
                                       use_bias=False,
                                       name=name1 + '_branch1')
        branch1_feats = self.nn.batch_norm(branch1_feats, name2 + '_branch1')

        branch2a_feats = self.nn.conv2d(inputs,
                                        filters=c,
                                        kernel_size=(1, 1),
                                        strides=(s, s),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2a')
        branch2a_feats = self.nn.batch_norm(branch2a_feats, name2 + '_branch2a')
        branch2a_feats = tf.nn.relu(branch2a_feats)

        branch2b_feats = self.nn.conv2d(branch2a_feats,
                                        filters=c,
                                        kernel_size=(3, 3),
                                        strides=(1, 1),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2b')
        branch2b_feats = self.nn.batch_norm(branch2b_feats, name2 + '_branch2b')
        branch2b_feats = tf.nn.relu(branch2b_feats)

        branch2c_feats = self.nn.conv2d(branch2b_feats,
                                        filters=4 * c,
                                        kernel_size=(1, 1),
                                        strides=(1, 1),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2c')
        branch2c_feats = self.nn.batch_norm(branch2c_feats, name2 + '_branch2c')

        outputs = branch1_feats + branch2c_feats
        outputs = tf.nn.relu(outputs)
        return outputs

    def resnet_block2(self, inputs, name1, name2, c):
        """ Another basic block of ResNet. """
        branch2a_feats = self.nn.conv2d(inputs,
                                        filters=c,
                                        kernel_size=(1, 1),
                                        strides=(1, 1),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2a')
        branch2a_feats = self.nn.batch_norm(branch2a_feats, name2 + '_branch2a')
        branch2a_feats = tf.nn.relu(branch2a_feats)

        branch2b_feats = self.nn.conv2d(branch2a_feats,
                                        filters=c,
                                        kernel_size=(3, 3),
                                        strides=(1, 1),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2b')
        branch2b_feats = self.nn.batch_norm(branch2b_feats, name2 + '_branch2b')
        branch2b_feats = tf.nn.relu(branch2b_feats)

        branch2c_feats = self.nn.conv2d(branch2b_feats,
                                        filters=4 * c,
                                        kernel_size=(1, 1),
                                        strides=(1, 1),
                                        activation=None,
                                        use_bias=False,
                                        name=name1 + '_branch2c')
        branch2c_feats = self.nn.batch_norm(branch2c_feats, name2 + '_branch2c')

        outputs = inputs + branch2c_feats
        outputs = tf.nn.relu(outputs)
        return outputs

    def build_rnn(self):
        """ Build the RNN. """
        print("Building the RNN...")
        config = self.config

        # Setup the placeholders
        if self.is_train:
            contexts = self.conv_feats
            sentences = tf.placeholder(
                dtype=tf.int32,
                shape=[config.batch_size, config.max_caption_length])
            masks = tf.placeholder(
                dtype=tf.float32,
                shape=[config.batch_size, config.max_caption_length])
        else:
            contexts = tf.compat.v1.placeholder(
                dtype=tf.float32,
                shape=[config.batch_size, self.num_ctx, self.dim_ctx])
            last_memory = tf.compat.v1.placeholder(
                dtype=tf.float32,
                shape=[config.batch_size, config.num_lstm_units])
            last_output = tf.compat.v1.placeholder(
                dtype=tf.float32,
                shape=[config.batch_size, config.num_lstm_units])
            last_word = tf.compat.v1.placeholder(
                dtype=tf.int32,
                shape=[config.batch_size])

        # Setup the word embedding
        with tf.compat.v1.variable_scope("word_embedding"):
            embedding_matrix = tf.compat.v1.get_variable(
                name='weights',
                shape=[config.vocabulary_size, config.dim_embedding],
                initializer=self.nn.fc_kernel_initializer,
                regularizer=self.nn.fc_kernel_regularizer,
                trainable=self.is_train)

        # Setup the LSTM
        lstm = tf.nn.rnn_cell.LSTMCell(
            config.num_lstm_units,
            initializer=self.nn.fc_kernel_initializer)

        if self.is_train:
            lstm = tf.nn.rnn_cell.DropoutWrapper(
                lstm,
                input_keep_prob=1.0 - config.lstm_drop_rate,
                output_keep_prob=1.0 - config.lstm_drop_rate,
                state_keep_prob=1.0 - config.lstm_drop_rate)

        # Initialize the LSTM using the mean context
        with tf.compat.v1.variable_scope("initialize"):
            context_mean = tf.reduce_mean(self.conv_feats, axis=1)
            initial_memory, initial_output = self.initialize(context_mean)
            initial_state = initial_memory, initial_output

        # Prepare to run
        predictions = []
        if self.is_train:
            alphas = []
            cross_entropies = []
            predictions_correct = []
            num_steps = config.max_caption_length
            last_output = initial_output
            last_memory = initial_memory
            last_word = tf.zeros([config.batch_size], tf.int32)
        else:
            num_steps = 1
        last_state = last_memory, last_output

        # Generate the words one by one
        for idx in range(num_steps):
            # Attention mechanism
            with tf.compat.v1.variable_scope("attend"):
                alpha = self.attend(contexts, last_output)
                context = tf.reduce_sum(contexts * tf.expand_dims(alpha, 2),
                                        axis=1)
                if self.is_train:
                    tiled_masks = tf.tile(tf.expand_dims(masks[:, idx], 1),
                                          [1, self.num_ctx])
                    masked_alpha = alpha * tiled_masks
                    alphas.append(tf.reshape(masked_alpha, [-1]))

            # Embed the last word
            with tf.compat.v1.variable_scope("word_embedding"):
                word_embed = tf.nn.embedding_lookup(embedding_matrix,
                                                    last_word)
            # Apply the LSTM
            with tf.compat.v1.variable_scope("lstm"):
                current_input = tf.concat([context, word_embed], 1)
                output, state = lstm(current_input, last_state)
                memory, _ = state

            # Decode the expanded output of LSTM into a word
            with tf.compat.v1.variable_scope("decode"):
                expanded_output = tf.concat([output,
                                             context,
                                             word_embed],
                                            axis=1)
                logits = self.decode(expanded_output)
                probs = tf.nn.softmax(logits)
                prediction = tf.argmax(logits, 1)
                predictions.append(prediction)

            # Compute the loss for this step, if necessary
            if self.is_train:
                cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels=sentences[:, idx],
                    logits=logits)
                masked_cross_entropy = cross_entropy * masks[:, idx]
                cross_entropies.append(masked_cross_entropy)

                ground_truth = tf.cast(sentences[:, idx], tf.int64)
                prediction_correct = tf.where(
                    tf.equal(prediction, ground_truth),
                    tf.cast(masks[:, idx], tf.float32),
                    tf.cast(tf.zeros_like(prediction), tf.float32))
                predictions_correct.append(prediction_correct)

                last_output = output
                last_memory = memory
                last_state = state
                last_word = sentences[:, idx]

            tf.compat.v1.get_variable_scope().reuse_variables()

        # Compute the final loss, if necessary
        if self.is_train:
            cross_entropies = tf.stack(cross_entropies, axis=1)
            cross_entropy_loss = tf.reduce_sum(cross_entropies) \
                / tf.reduce_sum(masks)

            alphas = tf.stack(alphas, axis=1)
            alphas = tf.reshape(alphas, [config.batch_size, self.num_ctx, -1])
            attentions = tf.reduce_sum(alphas, axis=2)
            diffs = tf.ones_like(attentions) - attentions
            attention_loss = config.attention_loss_factor \
                * tf.nn.l2_loss(diffs) \
                / (config.batch_size * self.num_ctx)

            reg_loss = tf.losses.get_regularization_loss()

            total_loss = cross_entropy_loss + attention_loss + reg_loss

            predictions_correct = tf.stack(predictions_correct, axis=1)
            accuracy = tf.reduce_sum(predictions_correct) \
                / tf.reduce_sum(masks)

        self.contexts = contexts
        if self.is_train:
            self.sentences = sentences
            self.masks = masks
            self.total_loss = total_loss
            self.cross_entropy_loss = cross_entropy_loss
            self.attention_loss = attention_loss
            self.reg_loss = reg_loss
            self.accuracy = accuracy
            self.attentions = attentions
        else:
            self.initial_memory = initial_memory
            self.initial_output = initial_output
            self.last_memory = last_memory
            self.last_output = last_output
            self.last_word = last_word
            self.memory = memory
            self.output = output
            self.probs = probs

        print("RNN built.")

    def initialize(self, context_mean):
        """ Initialize the LSTM using the mean context. """
        config = self.config
        context_mean = self.nn.dropout(context_mean)
        if config.num_initalize_layers == 1:
            # use 1 fc layer to initialize
            memory = self.nn.dense(context_mean,
                                   units=config.num_lstm_units,
                                   activation=None,
                                   name='fc_a')
            output = self.nn.dense(context_mean,
                                   units=config.num_lstm_units,
                                   activation=None,
                                   name='fc_b')
        else:
            # use 2 fc layers to initialize
            temp1 = self.nn.dense(context_mean,
                                  units=config.dim_initalize_layer,
                                  activation=tf.tanh,
                                  name='fc_a1')
            temp1 = self.nn.dropout(temp1)
            memory = self.nn.dense(temp1,
                                   units=config.num_lstm_units,
                                   activation=None,
                                   name='fc_a2')

            temp2 = self.nn.dense(context_mean,
                                  units=config.dim_initalize_layer,
                                  activation=tf.tanh,
                                  name='fc_b1')
            temp2 = self.nn.dropout(temp2)
            output = self.nn.dense(temp2,
                                   units=config.num_lstm_units,
                                   activation=None,
                                   name='fc_b2')
        return memory, output

    def attend(self, contexts, output):
        """ Attention Mechanism. """
        config = self.config
        reshaped_contexts = tf.reshape(contexts, [-1, self.dim_ctx])
        reshaped_contexts = self.nn.dropout(reshaped_contexts)
        output = self.nn.dropout(output)
        if config.num_attend_layers == 1:
            # use 1 fc layer to attend
            logits1 = self.nn.dense(reshaped_contexts,
                                    units=1,
                                    activation=None,
                                    use_bias=False,
                                    name='fc_a')
            logits1 = tf.reshape(logits1, [-1, self.num_ctx])
            logits2 = self.nn.dense(output,
                                    units=self.num_ctx,
                                    activation=None,
                                    use_bias=False,
                                    name='fc_b')
            logits = logits1 + logits2
        else:
            # use 2 fc layers to attend
            temp1 = self.nn.dense(reshaped_contexts,
                                  units=config.dim_attend_layer,
                                  activation=tf.tanh,
                                  name='fc_1a')
            temp2 = self.nn.dense(output,
                                  units=config.dim_attend_layer,
                                  activation=tf.tanh,
                                  name='fc_1b')
            temp2 = tf.tile(tf.expand_dims(temp2, 1), [1, self.num_ctx, 1])
            temp2 = tf.reshape(temp2, [-1, config.dim_attend_layer])
            temp = temp1 + temp2
            temp = self.nn.dropout(temp)
            logits = self.nn.dense(temp,
                                   units=1,
                                   activation=None,
                                   use_bias=False,
                                   name='fc_2')
        logits = tf.reshape(logits, [-1, self.num_ctx])
        alpha = tf.nn.softmax(logits)
        return alpha

    def decode(self, expanded_output):
        """ Decode the expanded output of the LSTM into a word. """
        config = self.config
        expanded_output = self.nn.dropout(expanded_output)
        if config.num_decode_layers == 1:
            # use 1 fc layer to decode
            logits = self.nn.dense(expanded_output,
                                   units=config.vocabulary_size,
                                   activation=None,
                                   name='fc')
        else:
            # use 2 fc layers to decode
            temp = self.nn.dense(expanded_output,
                                 units=config.dim_decode_layer,
                                 activation=tf.tanh,
                                 name='fc_1')
            temp = self.nn.dropout(temp)
            logits = self.nn.dense(temp,
                                   units=config.vocabulary_size,
                                   activation=None,
                                   name='fc_2')
        return logits

    def build_optimizer(self):
        """ Setup the optimizer and training operation. """
        config = self.config

        learning_rate = tf.constant(config.initial_learning_rate)
        if config.learning_rate_decay_factor < 1.0:
            def _learning_rate_decay_fn(learning_rate, global_step):
                return tf.train.exponential_decay(
                    learning_rate,
                    global_step,
                    decay_steps=config.num_steps_per_decay,
                    decay_rate=config.learning_rate_decay_factor,
                    staircase=True)

            learning_rate_decay_fn = _learning_rate_decay_fn
        else:
            learning_rate_decay_fn = None

        with tf.variable_scope('optimizer', reuse=tf.AUTO_REUSE):
            if config.optimizer == 'Adam':
                optimizer = tf.train.AdamOptimizer(
                    learning_rate=config.initial_learning_rate,
                    beta1=config.beta1,
                    beta2=config.beta2,
                    epsilon=config.epsilon
                )
            elif config.optimizer == 'RMSProp':
                optimizer = tf.train.RMSPropOptimizer(
                    learning_rate=config.initial_learning_rate,
                    decay=config.decay,
                    momentum=config.momentum,
                    centered=config.centered,
                    epsilon=config.epsilon
                )
            elif config.optimizer == 'Momentum':
                optimizer = tf.train.MomentumOptimizer(
                    learning_rate=config.initial_learning_rate,
                    momentum=config.momentum,
                    use_nesterov=config.use_nesterov
                )
            else:
                optimizer = tf.train.GradientDescentOptimizer(
                    learning_rate=config.initial_learning_rate
                )

            opt_op = tf.contrib.layers.optimize_loss(
                loss=self.total_loss,
                global_step=self.global_step,
                learning_rate=learning_rate,
                optimizer=optimizer,
                clip_gradients=config.clip_gradients,
                learning_rate_decay_fn=learning_rate_decay_fn)

        self.opt_op = opt_op

    def build_summary(self):
        """ Build the summary (for TensorBoard visualization). """
        with tf.name_scope("variables"):
            for var in tf.trainable_variables():
                with tf.name_scope(var.name[:var.name.find(":")]):
                    self.variable_summary(var)

        with tf.name_scope("metrics"):
            tf.summary.scalar("cross_entropy_loss", self.cross_entropy_loss)
            tf.summary.scalar("attention_loss", self.attention_loss)
            tf.summary.scalar("reg_loss", self.reg_loss)
            tf.summary.scalar("total_loss", self.total_loss)
            tf.summary.scalar("accuracy", self.accuracy)

        with tf.name_scope("attentions"):
            self.variable_summary(self.attentions)

        self.summary = tf.summary.merge_all()

    def variable_summary(self, var):
        """ Build the summary for a variable. """
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)

It also includes translating the generated caption into Chinese (the training set maps images to English, so the pipeline is image -> English -> Chinese). For convenience, this part calls an API from Alibaba DAMO Academy directly:

image

Search Method

When the user searches, the search can be performed directly through text similarity:

# Relies on the module-level globals defined earlier (Action, response, SignUrl,
# thumbnailKey, checkParameter, ERROR, searchTempDir, getMD5), plus os/json/jieba,
# collections.defaultdict, and gensim's corpora, models, and similarities.

# Search images
@bottle.route('/picture/search', method='POST')
def searchPicture():
    print("PATH: /picture/search")
    try:
        # Read parameters
        postData = json.loads(bottle.request.body.read().decode("utf-8"))
        print("PostData: ", postData)
        secret = postData.get('secret', None)
        keyword = postData.get('keyword', None)
        page = postData.get("page", None)
        try:
            page = int(page)
        except:
            page = 1

        # Validate parameters
        print('Check parameter')
        if not checkParameter([secret, ]):
            return response(ERROR['ParameterException'], 'ParameterException')

        # Check whether the user exists
        print("Check User Information")
        user = Action("SELECT * FROM User WHERE `secret`=? AND `state`=1;", (secret,)).fetchone()
        if not user:
            return response(ERROR['UserInformationError'], 'UserInformationError')

        getPhotos = lambda photos: [{
            "id": evePhoto['photo_id'],
            "pictureThumbnail": SignUrl(thumbnailKey(evePhoto['file_token']), "GET", 600),
            "pictureSource": SignUrl("photo/original/%s" % (evePhoto['file_token']), "GET", 600),
            "date": evePhoto['update_time'].split(" ")[0][2:],
            "location": evePhoto['album_place'] or "Earth",
            "album": evePhoto['album_name'],
            "owner": True if evePhoto['type'] == 1 else False
        } for evePhoto in photos]

        if not page or page == 1:
            print("Get Photo")
            searchStmt = ("SELECT photo.description photo_description, photo.user_description photo_user_description, "
                          "album.name album_name, album.place album_place, album.description album_description, "
                          "photo.id photo_id, * FROM Photo AS photo INNER JOIN Album AS album INNER JOIN AlbumUser AS "
                          "album_user WHERE album.`id`=photo.`album` AND album_user.`album`=photo.`album` AND "
                          "album.`id`=photo.`album` AND album_user.`user`=? AND photo.`state`=1 ORDER BY -photo.`id`;")
            photos = Action(searchStmt, (user["id"],)).fetchall()
            resultDict = {}
            searchKeyword = keyword.split(" ")
            resultTemp = {}
            documents = []
            print("Format Photo Information")
            for evePhoto in photos:
                if not (len(evePhoto["password"]) >= 0 and evePhoto['type'] != 1):
                    tempSentence = ("%s%s%s%s%s" % (evePhoto["photo_description"],
                                                    evePhoto["photo_user_description"],
                                                    evePhoto['album_name'],
                                                    evePhoto["album_place"],
                                                    evePhoto["album_description"])).replace(" ", "")
                    resultTemp[tempSentence] = evePhoto
                    tempNum = 0
                    for eveWord in searchKeyword:
                        if eveWord in tempSentence:
                            tempNum = tempNum + 0.05
                    resultDict[tempSentence] = tempNum
                    documents.append(tempSentence)

            print("Photo Prediction")

            texts = [[word for word in document.split()] for document in documents]
            frequency = defaultdict(int)
            for text in texts:
                for word in text:
                    frequency[word] += 1
            dictionary = corpora.Dictionary(texts)
            new_xs = dictionary.doc2bow(jieba.cut(keyword))
            corpus = [dictionary.doc2bow(text) for text in texts]
            tfIdf = models.TfidfModel(corpus)
            featureNum = len(dictionary.token2id.keys())
            sim = similarities.SparseMatrixSimilarity(tfIdf[corpus], num_features=featureNum)[tfIdf[new_xs]]
            resultList = [(sim[i] + resultDict[documents[i]], documents[i]) for i in
                          range(0, len(documents))]
            resultList.sort(key=lambda x: x[0], reverse=True)
            result = []
            for eve in resultList:
                if eve[0] >= 0.05:
                    photo = resultTemp[eve[1]]
                    result.append({"photo_id": photo['photo_id'],
                                   "file_token": photo['file_token'],
                                   "update_time": photo['update_time'],
                                   "album_place": photo['album_place'],
                                   "album_name": photo['album_name'],
                                   "type": photo['type']})

            if not os.path.exists(searchTempDir):
                os.mkdir(searchTempDir)

            # Cache the rows as JSON for later pages
            with open(searchTempDir + secret + getMD5(keyword), "w") as f:
                f.write(json.dumps(result))
            return response(getPhotos(result[0:51]))
        else:
            with open(searchTempDir + secret + getMD5(keyword), "r") as f:
                result = json.loads(f.read())
            return response(getPhotos(result[page * 50:page * 50 + 51]))
    except Exception as e:
        print("Error: ", e)
        return response(ERROR['SystemError'], 'SystemError')

Admin System Development

This part is implemented directly with the SQLite-Web project. Find the SQLite-Web repository on GitHub:

image

and install the dependencies in the sqlite_web directory:

flask
pygments
peewee

After installing them, upload the project directly:

image

Project Preview

With all the steps above complete, the project is essentially ready to use and can be previewed.

The project has been published as a WeChat mini program: 一册时光

image

Mini Program Side

Login/registration page:
image

Album preview, image viewing, and upload pages:
image

Album management and personal center pages:
image

Admin Pages

Login page:
image

Home page:
image

Database page:
image

Lessons Learned

Web Frameworks and Alibaba Cloud Function Compute

Thanks to Alibaba Cloud Function Compute's support for HTTP functions, putting a traditional web framework on Function Compute is very convenient.

image

With frameworks like Bottle and Flask, the path from a user request to the framework's handler roughly passes through three layers: the web server, WSGI, and the method the developer implements. Since Function Compute is itself event-driven, by the time an HTTP function hands us the event, we can consider ourselves already at the app layer. So for frameworks such as Bottle and Flask, we can do the following.

Taking Flask code as an example:

# index.py
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

The configuration in Function Compute:
image

In other words, we only need to write the function entry point as:

<file name>.<Flask object variable name>

Likewise, when we have a Bottle project, we can use a similar approach:

# index.py
import bottle

@bottle.route('/hello/<name>')
def index(name):
    return "Hello world"

app = bottle.default_app()

This time the function entry point only needs to be:

<file name>.<default app variable name>

that is, index.app.

It is precisely this design that made it very easy to deploy this project's admin system (a Flask-based project).

How to Debug Locally

Serverless architecture has one frequently criticized pain point: how do you debug locally?

During the development of this project my debugging approach was simple, even crude; it may not suit every scenario, but I believe it is worth trying in most of them.

For projects built on a web framework, debugging simply means starting the framework locally. In this project, for instance, I debugged the synchronous API by launching a local web service through Bottle's run method, as in the sketch below.
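
A minimal sketch of that setup, assuming routes are defined as elsewhere in this article:

# index.py
import bottle

# ... route definitions such as @bottle.route('/prewarm', ...) go here ...

app = bottle.default_app()

if __name__ == '__main__':
    # Local debugging only; on Function Compute the entry point is index.app
    bottle.run(app, host='localhost', port=8080, debug=True)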

For non-web-framework code, my approach is to build a harness method locally. For example, to debug the object storage trigger, my scheme is:

import json


def handler(event, context):
    print(event)


def test():
    event = {
        "events": [
            {
                "eventName": "ObjectCreated:PutObject",
                "eventSource": "acs:oss",
                "eventTime": "2017-04-21T12:46:37.000Z",
                "eventVersion": "1.0",
                "oss": {
                    "bucket": {
                        "arn": "acs:oss:cn-shanghai:123456789:bucketname",
                        "name": "testbucket",
                        "ownerIdentity": "123456789",
                        "virtualBucket": ""
                    },
                    "object": {
                        "deltaSize": 122539,
                        "eTag": "688A7BF4F233DC9C88A80BF985AB7329",
                        "key": "image/a.jpg",
                        "size": 122539
                    },
                    "ossSchemaVersion": "1.0",
                    "ruleId": "9adac8e253828f4f7c0466d941fa3db81161****"
                },
                "region": "cn-shanghai",
                "requestParameters": {
                    "sourceIPAddress": "140.205.***.***"
                },
                "responseElements": {
                    "requestId": "58F9FF2D3DF792092E12044C"
                },
                "userIdentity": {
                    "principalId": "123456789"
                }
            }
        ]
    }
    handler(json.dumps(event), None)


if __name__ == "__main__":
    print(test())

This way, while implementing the handler method I can debug by running this file repeatedly, and I can tailor the Event payload and other content to my needs.

Summary

As Serverless architecture develops, it can play an increasingly important role in more and more fields. Starting from my own needs and turning them into a project, this article implemented an AI-powered album mini program by migrating a traditional framework onto a Serverless architecture, debugging the project locally, and combining Function Compute with products such as Object Storage and NAS. Benefiting from the technical dividends of mini programs themselves, stacked with those of Serverless architecture, the project's development efficiency rose dramatically, and it gained advantages such as extreme elasticity, pay-per-use billing, and a maintenance-free server side.

In the future, when out with friends, we can maintain an album together; and however time flies, years later we will still be able to search for "drinking by the sea" and retrieve those youthful days.

You are welcome to follow my blog and to repost this article; when reposting, please cite the original: http://bluo.cn/serverless-ai-album/ . For questions about Serverless and related topics, feel free to contact me: 80902630
