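This guide uses a Java + Spring Boot + MySQL + Elasticsearch 7.17.9 stack. Every component installation comes with a ready-to-run command or a direct download link, so nothing has to be looked up separately.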
Elasticsearch installation: prefer Docker to avoid environment conflicts. Run the following commands:
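```
# Pull the image for the matching version
docker pull elasticsearch:7.17.9
# Start a single-node ES instance
docker run -d --name es-archive -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" elasticsearch:7.17.9
```
Fixing startup errors: if startup fails with an insufficient-memory message (for example, one saying that `vm.max_map_count` is too low), run `sysctl -w vm.max_map_count=262144` on the host. To make the change permanent, add `vm.max_map_count=262144` to `/etc/sysctl.conf`, run `sysctl -p`, and then restart ES.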
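The kernel setting can also be checked before ES is started. The following is a minimal sketch, assuming a Linux host where the value is exposed at `/proc/sys/vm/max_map_count`; the class and method names are hypothetical and not part of the original setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class MaxMapCountCheck {
    /** The value the fix above sets. */
    static final long REQUIRED = 262_144L;

    /** Returns true when the raw file content parses to a value >= REQUIRED. */
    static boolean isSufficient(String raw) {
        try {
            return Long.parseLong(raw.trim()) >= REQUIRED;
        } catch (NumberFormatException e) {
            return false; // unreadable content is treated as "not sufficient"
        }
    }

    /** Reads the live kernel setting (assumption: Linux host). */
    static boolean hostIsSufficient() throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get("/proc/sys/vm/max_map_count"));
        return isSufficient(new String(bytes, StandardCharsets.US_ASCII));
    }
}
```

If `MaxMapCountCheck.hostIsSufficient()` returns `false`, apply the `sysctl` fix before starting the container.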
After installation, run `curl http://localhost:9200`. A response that reports version `7.17.9` means the installation succeeded.
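The same check can be scripted. The sketch below is one possible approach, assuming Java 11+ (for `java.net.http`) and that the root endpoint reports the version in a `"number"` field; the class name and the regex-based parsing are illustrative choices, and a real project would more likely use a JSON library.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EsVersionCheck {
    // Matches the first "number" field in the body, e.g. "number" : "7.17.9"
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Fetches the body of the ES root endpoint, e.g. http://localhost:9200 */
    static String fetchRoot(String baseUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Extracts the version number from a response body, if present. */
    static Optional<String> extractVersion(String body) {
        Matcher m = NUMBER.matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True when the body reports exactly the expected version. */
    static boolean isExpectedVersion(String body, String expected) {
        return extractVersion(body).map(expected::equals).orElse(false);
    }
}
```

For example, `EsVersionCheck.isExpectedVersion(EsVersionCheck.fetchRoot("http://localhost:9200"), "7.17.9")` should be `true` once the container is up.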
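IK analyzer installation: download the package for the matching version directly from https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.9/elasticsearch-analysis-ik-7.17.9.zip. Unzip it into the ES container's `/usr/share/elasticsearch/plugins` directory (in its own subdirectory, for example `plugins/ik`, since ES expects one directory per plugin), then run `docker restart es-archive` to restart ES.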
Logstash installation: Logstash is used to sync MySQL data into ES. Download it from https://artifacts.elastic.co/downloads/logstash/logstash-7.17.9-linux-x86_64.tar.gz and unpack it; no further installation is needed.
MySQL driver download: the Logstash sync needs the MySQL JDBC driver. Download it from https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar and place it in Logstash's `config` directory.
First, create the main archive table in MySQL. The following SQL can be copied and run as-is:
```sql
CREATE TABLE `archive_info` (
  `id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Primary key ID',
  `archive_no` varchar(32) NOT NULL COMMENT 'Archive number',
  `archive_type` varchar(16) NOT NULL COMMENT 'Archive type: 人事 (personnel) / 项目 (project) / 合同 (contract) / 财务 (finance)',
  `owner_name` varchar(32) DEFAULT NULL COMMENT 'Owner or owning project name',
  `create_time` date NOT NULL COMMENT 'Archiving date',
  `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Last update time',
  `storage_location` varchar(64) DEFAULT NULL COMMENT 'Storage location',
  `keyword` text COMMENT 'Archive keywords',
  `content_summary` text COMMENT 'Archive content summary',
  `is_deleted` tinyint DEFAULT '0' COMMENT 'Soft-delete flag',
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_archive_no` (`archive_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Main archive table';
```
In Logstash's `config` directory, create a configuration file named `logstash-archive.conf`, copy in the following content, and adjust the MySQL settings:
```
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://<your-mysql-host>:3306/archive_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai"
    jdbc_user => "<your-mysql-user>"
    jdbc_password => "<your-mysql-password>"
    jdbc_driver_library => "./config/mysql-connector-java-8.0.28.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    statement => "SELECT id,archive_no,archive_type,owner_name,create_time,storage_location,keyword,content_summary FROM archive_info WHERE is_deleted=0 AND update_time > :sql_last_value"
    # Cron syntax: run every 5 minutes
    schedule => "*/5 * * * *"
    record_last_run => true
    last_run_metadata_path => "./config/archive_last_value"
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "archive_index"
    document_id => "%{id}"
  }
}
```
From the Logstash directory, run `./bin/logstash -f config/logstash-archive.conf` to start the sync. After 5 minutes, run `curl http://localhost:9200/archive_index/_count`; a `count` greater than 0 means the sync succeeded.
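To make the incremental behaviour concrete, the `WHERE is_deleted=0 AND update_time > :sql_last_value` filter can be mimicked in plain Java. This is only an illustrative sketch of the selection rule, not of Logstash itself; the `Row` class and method names are hypothetical. With the plugin's default settings, `:sql_last_value` is the time of the previous run (starting at 1970-01-01 on the first run).

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

final class IncrementalSyncSketch {
    /** Minimal stand-in for an archive_info row (only the columns the filter reads). */
    static final class Row {
        final long id;
        final boolean deleted;
        final LocalDateTime updateTime;

        Row(long id, boolean deleted, LocalDateTime updateTime) {
            this.id = id;
            this.deleted = deleted;
            this.updateTime = updateTime;
        }
    }

    /** Mirrors: WHERE is_deleted=0 AND update_time > :sql_last_value */
    static List<Long> selectForSync(List<Row> rows, LocalDateTime sqlLastValue) {
        List<Long> ids = new ArrayList<>();
        for (Row r : rows) {
            if (!r.deleted && r.updateTime.isAfter(sqlLastValue)) {
                ids.add(r.id);
            }
        }
        return ids;
    }
}
```

One consequence worth noting: a row that is soft-deleted after it has been indexed no longer matches the filter, so its document stays in `archive_index` unless it is removed by some other means.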
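Run the following request to create the ES index; it can be pasted into Postman or Kibana as-is. Create the index before starting the Logstash sync, because otherwise Elasticsearch auto-creates `archive_index` with default dynamic mappings on the first write: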
```json
PUT /archive_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "ik_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "archive_no": {"type": "keyword"},
      "archive_type": {"type": "keyword"},
      "owner_name": {"type": "text", "analyzer": "ik_analyzer", "search_analyzer": "ik_smart"},
      "create_time": {"type": "date", "format": "yyyy-MM-dd"},
      "storage_location": {"type": "text", "analyzer": "ik_analyzer"},
      "keyword": {"type": "text", "analyzer": "ik_analyzer"},
      "content_summary": {"type": "text", "analyzer": "ik_analyzer"}
    }
  }
}
```
In the Spring Boot project, first add the ES dependency:
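```xml
<!-- The original snippet is missing here. Assumed dependency, inferred from
     the ElasticsearchRestTemplate and spring.elasticsearch.* usage in this guide. -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
```
Add the ES configuration to `application.yml`: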
```yaml
spring:
  elasticsearch:
    uris: http://localhost:9200
    connection-timeout: 5s
    socket-timeout: 30s
```
Create the search-parameter DTO:
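```java
import lombok.Data;

@Data
public class ArchiveSearchDTO {
    private String archiveNo;      // exact match on archive number
    private String archiveType;    // exact match on archive type
    private String keyword;        // full-text fuzzy match
    private String startDate;      // archiving date, lower bound
    private String endDate;        // archiving date, upper bound
    private Integer pageNum = 1;   // 1-based page number
    private Integer pageSize = 10; // results per page
}
```
The core search service code: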
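```java
@Service
public class ArchiveSearchService {

    @Resource
    private ElasticsearchRestTemplate elasticsearchRestTemplate;

    public Page
    // ... (the remainder of this method is cut off in the source)
```
Synonym search configuration: in the ES `config` directory, create `analysis/synonym.txt` and add the synonym rules, one comma-separated group per line, for example `人事,员工,职工` on the first line and `合同,协议,契约` on the second. Then add a `synonym` token filter that points at this file (`"synonyms_path": "analysis/synonym.txt"`) to the analyzer's `filter` list in the index settings and restart ES. Note that analyzer settings of an existing index can only be changed while the index is closed, or by recreating the index; once the new settings are in place, synonym matching is supported.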
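What the synonym file expresses can be illustrated with a small Java sketch. It is a deliberately simplified model of comma-separated equivalence groups (it ignores explicit `=>` mappings) and is not how Elasticsearch implements the filter; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SynonymSketch {
    /** Parses lines such as "a,b,c" into a term -> equivalence-group lookup. */
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> lookup = new HashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            Set<String> group = new LinkedHashSet<>();
            for (String term : trimmed.split(",")) {
                String t = term.trim();
                if (!t.isEmpty()) {
                    group.add(t);
                }
            }
            for (String t : group) {
                lookup.put(t, group); // every member maps to the whole group
            }
        }
        return lookup;
    }

    /** Returns all terms considered equivalent to the given term (itself included). */
    static Set<String> expand(Map<String, Set<String>> lookup, String term) {
        return lookup.getOrDefault(term, Collections.singleton(term));
    }
}
```

With the example file above, expanding `员工` would yield the group `人事`, `员工`, `职工`, which is why a search for any one of these terms can match documents containing the others.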
Send a POST request to `/api/archive/search`. Example request body:
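```json
{
  "archiveType": "人事",
  "keyword": "张三 2023",
  "startDate": "2023-01-01",
  "endDate": "2023-12-31",
  "pageNum": 1,
  "pageSize": 10
}
```
If the response contains all 2023 personnel (人事) archives that mention 张三, with the matched keywords highlighted in red, the feature is complete.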
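To show how such a request could map onto the index, the sketch below builds an Elasticsearch query body from the DTO fields using only the JDK. It is an assumption about what the search method might send, not the original implementation: the searched fields, the highlight tags, and the class name are illustrative, and a Spring project would more likely build the same query with `NativeSearchQueryBuilder` and run it through `ElasticsearchRestTemplate`.

```java
import java.util.ArrayList;
import java.util.List;

final class ArchiveQuerySketch {

    /** Builds a bool query: exact filters, optional full-text match, paging, highlight. */
    static String build(String archiveNo, String archiveType, String keyword,
                        String startDate, String endDate, int pageNum, int pageSize) {
        List<String> filters = new ArrayList<>();
        if (notBlank(archiveNo)) filters.add(term("archive_no", archiveNo));
        if (notBlank(archiveType)) filters.add(term("archive_type", archiveType));
        if (notBlank(startDate) || notBlank(endDate)) filters.add(range(startDate, endDate));

        // Assumed: the keyword is matched against the three analyzed text fields
        String must = notBlank(keyword)
                ? "{\"multi_match\":{\"query\":\"" + esc(keyword)
                    + "\",\"fields\":[\"owner_name\",\"keyword\",\"content_summary\"]}}"
                : "{\"match_all\":{}}";

        int size = Math.max(1, pageSize);
        int from = (Math.max(1, pageNum) - 1) * size; // pageNum is 1-based

        return "{\"from\":" + from + ",\"size\":" + size
                + ",\"query\":{\"bool\":{\"must\":[" + must + "],\"filter\":["
                + String.join(",", filters) + "]}}"
                // Assumed highlight markup for rendering matches in red
                + ",\"highlight\":{\"pre_tags\":[\"<span style='color:red'>\"],"
                + "\"post_tags\":[\"</span>\"],"
                + "\"fields\":{\"owner_name\":{},\"keyword\":{},\"content_summary\":{}}}}";
    }

    private static String term(String field, String value) {
        return "{\"term\":{\"" + field + "\":\"" + esc(value) + "\"}}";
    }

    private static String range(String start, String end) {
        List<String> bounds = new ArrayList<>();
        if (notBlank(start)) bounds.add("\"gte\":\"" + esc(start) + "\"");
        if (notBlank(end)) bounds.add("\"lte\":\"" + esc(end) + "\"");
        return "{\"range\":{\"create_time\":{" + String.join(",", bounds) + "}}}";
    }

    private static boolean notBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }

    /** Escapes quotes, backslashes and control characters for a JSON string value. */
    private static String esc(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Keeping the exact-match and date conditions in `filter` rather than `must` means they restrict the result set without affecting relevance scoring, while the keyword clause alone determines ranking.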