Archive

43 posts in total

2026 (3 posts)

05-06

Getting Started with PaddleOCR: Conda Env + GPU Inference End-to-End

A complete log of setting up a PaddleOCR development environment from scratch: Conda isolation, PaddlePaddle GPU installation, CLI verification, and Python API integration.

PaddleOCR OCR Conda Deep Learning

04-30

I've Joined the Vibe Coding Wave

From web-based chat editing to Cline, then to Cursor — a personal account of embracing AI coding agents and how they genuinely transformed my development workflow.

AI Vibe Coding Cursor Productivity

04-29

I Declare KDE Plasma the Best Desktop Environment in the World

A log of pitfalls encountered upgrading from Fedora 36 all the way to 43, plus a surprisingly pleasant experience with the stability of the latest KDE Plasma.

Linux Fedora KDE Desktop Environment

2025 (17 posts)

12-31

Setting Up a VSCode, Qt, and CMake Development Environment

This video tutorial demonstrates how to configure your development environment using VSCode, Qt, and CMake for efficient C++ cross-platform development.

C++ Qt VSCode CMake

12-30

Recent Advances and Trends in Emotion Recognition Research

This article provides a comprehensive overview of recent breakthroughs in emotion recognition, covering multimodal reasoning, audio-driven facial animation, EEG analysis, and the ethics of affective computing.

12-30

arXiv Highlights: Recent Advances, Trends, and Future Directions in Contrastive Learning

This edition summarizes key papers in contrastive learning, medical image segmentation, graph neural networks, and generative models, highlighting the shift toward robust, cross-modal, and theory-driven machine learning.

12-30

The Background and Significance of Micro-expression Research: An Interdisciplinary Journey

This article provides an in-depth look at the definition, psychological foundations, and current state of micro-expression research in computer vision, highlighting its applications in public safety, mental health, and human-computer interaction.

12-30

AI Research Roundup: The Frontier of Diffusion Models (End of 2025)

This article reviews the latest research progress in diffusion models as of late 2025, covering key advancements in video super-resolution, 3D generation, robotic control, physical simulation, and model safety.

12-30

Fixing TensorFlow I/O Compatibility Errors

Encountering an 'undefined symbol' error when importing TensorFlow I/O in Kaggle? Learn how to resolve this compatibility issue by matching the correct library versions.

Kaggle tensorflow-io Python

12-30

Frontiers in Embodied AI: From 3D Scene Generation to Multi-Agent Collaboration

This article provides a comprehensive overview of recent breakthroughs in Embodied AI, covering key areas such as 3D scene generation, Vision-Language-Action (VLA) models, multi-agent systems, spatial intelligence, and AI safety.

12-30

The Hook Model: Why We Can’t Stop Scrolling and How to Reclaim Your Focus

By deconstructing the Hook Model, this article explores the psychological mechanisms behind digital addiction and offers actionable strategies to regain control over your attention.

Behavioral Design Product Design Digital Addiction

12-30

Frontiers in Multimodal Learning: From Controllable Generation to Autonomous Reasoning

This article summarizes key recent advancements in multimodal learning, covering controllable generation, autonomous memory, robustness, multimodal reasoning, and safety detection, while analyzing evolving research trends.

12-30

How to Force Git Pull and Overwrite Local Changes

Learn how to force update your Git repository to match the remote branch by discarding local changes, along with safer alternatives for preserving your work.

Git

12-30

Frontiers in Reinforcement Learning: From Embodied Robotics to Multi-Agent Coordination

This summary highlights recent breakthroughs in reinforcement learning, covering embodied AI, multi-agent systems, offline learning, communication optimization, and the integration of generative models.

12-30

Advances in Remote Sensing: From Multimodal Reasoning to Disaster Perception

This summary covers recent breakthroughs in remote sensing, including large-scale disaster datasets, multimodal geospatial reasoning models, hyperspectral image restoration, and autonomous drone navigation.

12-30

Advances in Remote Photoplethysmography (rPPG): A 2025 Research Overview

This article surveys key 2025 research in remote photoplethysmography (rPPG), covering multimodal fusion, lightweight model architectures, robustness in dynamic environments, and clinical validation.

12-30

The Death of Laplace's Demon - How Quantum Mechanics Shattered the Omniscient Predictor Myth

Laplace's Demon was once the ultimate symbol of scientific determinism, but quantum mechanics dismantled this fantasy. This article explores how Heisenberg's uncertainty principle, quantum randomness, and entanglement prove that an omniscient predictor is physically impossible.

12-30

Frontiers in Video Generation and Multimodal Understanding

This roundup highlights recent advancements in video generation, multimodal understanding, and benchmarking, covering instruction-guided editing, scientific video reasoning, and efficient compression.

12-30

arXiv Highlights: Latest Research Trends in Vision-Language Models and Embodied AI (May 2025)

This article provides a curated overview of the latest research in Vision-Language Models (VLM) and Embodied AI as of May 2025, covering key areas such as 3D scene generation, multi-agent collaboration, robotic control, and security.

12-30

Why Is the Speed of Light Constant? Unveiling the Universe's Ultimate Speed Limit

Why is the speed of light a universal constant? This article explores the core principles of Special Relativity and how the invariance of light speed reshaped our understanding of space and time.

2023 (3 posts)

04-11

Fixing Git Clone Failures for OpenVINO - RPC failed curl 56

If you encounter a curl 56 network error while cloning the OpenVINO repository from Gitee, you can resolve it by adjusting your Git buffer configuration.

OpenVINO Docker GPU Docker

04-08

How to Run Docker Without Sudo

By default, Docker commands require sudo. This guide explains how to enable non-root access by adding your user to the docker group, and how to troubleshoot common permission issues.

Docker GPU Docker

03-31

How to Enable CUDA GPU Acceleration for face_recognition

The face_recognition library runs on the CPU by default. This guide explains how to enable GPU acceleration by recompiling dlib with CUDA support.

Docker GPU Docker Web Camera face_recognition

2022 (11 posts)

11-03

Deploying GPU and Web Camera Applications with Docker

This guide explains how to containerize Python applications that require GPU acceleration and USB camera access using Docker, including private registry configuration and disk management.

Docker Image Recognition GPU Docker Web Camera

08-11

Guide to Compiling and Installing GNUPlot

This guide explains how to compile and install GNUPlot 5.4.3 from source on CentOS 8 or Amazon Linux 2, including how to resolve libgd compatibility issues.

GNUPlot CentOS 8 aws-linux-2

07-04

Compiling FFmpeg with CUDA Support on CentOS 8

This guide provides step-by-step instructions for compiling and installing FFmpeg with CUDA hardware acceleration support on CentOS 8.

FFmpeg CUDA CentOS 8

06-15

A Beginner's Guide to C++ Regular Expressions

This article provides an introduction to using the C++ `<regex>` standard library for string pattern matching and manipulation, covering basic usage and core components.

Regex Regular Expressions C-Cpp

06-15

Adapting CvxText for Drawing Chinese Characters in OpenCV 4.5

This article explains how to update legacy CvxText code to function correctly in OpenCV 4.5 by resolving header file inclusions and data type conversion issues.

CvxText OpenCV 4.5 C-Cpp

06-15

Performing FTP Operations with the POCO Library

This guide demonstrates how to implement FTP file uploads using POCO, a lightweight and flexible C++ network library.

FTP poco C-Cpp

04-14

Curved Text Detection with PaddleOCR

Text detection is a classic computer vision task, and curved text presents a unique challenge due to its free-form nature. This guide demonstrates how to perform curved text detection using PaddleOCR.

Deep Learning Inference OCR Curved Text Detection PaddleOCR

03-26

Tips for Converting NumPy to OpenCV Mat

This guide explores how to map common NumPy array operations—such as Sigmoid functions, channel slicing, and conditional filtering—to C++ using OpenCV's cv::Mat.

NumPy to C++ cv::Mat

03-09

A Simple Guide to Displaying OpenCV Video Streams with PySide6 in Python

Learn how to display real-time video streams from a camera in a Qt interface using OpenCV and the official PySide6 library in this concise guide.

Python Qt OpenCV Image Processing

03-04

MMOCR Installation and Training on Custom Datasets

This guide walks you through installing the MMOCR framework and demonstrates how to convert PASCAL VOC annotations to COCO format for custom model training.

MMOCR OCR PyTorch

02-08

Setting Up a GitLab Server on Ubuntu 16.04 in a Local LAN

A concise guide to installing and configuring GitLab on a local Ubuntu 16.04 server without a domain name, bypassing the complexity of standard documentation.

GitLab Git Ubuntu 16.04

2021 (8 posts)

08-26

Handling ROI Clipping Elegantly with the OpenCV & Operator

Learn how to use the `&` operator in OpenCV to perform intersection operations on `cv::Rect`, allowing for elegant image ROI clipping and boundary validation without verbose conditional logic.

C-Cpp Operator & OpenCV Image Processing

08-24

A Guide to the MNIST-ROT Dataset

This post introduces the MNIST-ROT (Rotated MNIST) dataset, a standard benchmark for evaluating rotation-equivariant algorithms, and provides the necessary download links.

PyTorch MNIST-ROT Deep Learning Neural Network Training

08-24

A Guide to PyTorch Learning Rate Schedulers

An overview of common PyTorch learning rate scheduling strategies and how to implement them to optimize the training process, using StepLR as a practical example.

Python PyTorch lr_scheduler Deep Learning

08-23

OpenCV DNN Batch Inference: A Guide for Image Classification

This guide demonstrates how to use the OpenCV DNN module for image classification, covering core APIs, the differences between Mat and Blob formats, and the implementation of both single-image and batch inference.

C-Cpp OpenCV Image Classification Batch Inference

08-21

Qt HTTP Server Example: Implementing Client POST and Server Response

A guide on how to perform HTTP POST requests with JSON payloads in Qt and how to build a corresponding local server using the QtHttpServer module.

C-Cpp Qt QtHttpServer CMake

08-20

Managing C/C++ Projects with CMake: From Basics to Library Integration

A practical guide on using CMake to manage C/C++ projects, covering project setup and integration with popular libraries like OpenCV, Boost, Qt, and CUDA.

C-Cpp CMake Qt OpenCV

08-18

Integrating QtHttpServer into a Dynamic Library with C APIs: Handling QCoreApplication Dependencies

Learn how to compile the QtHttpServer module and solve the QCoreApplication event loop initialization challenge when wrapping Qt functionality into a C-compatible dynamic library.

C-Cpp Qt QtHttpServer CMake

08-16

Compiling OpenCV with CUDA Support using CMake and VS2019 on Windows 10

A step-by-step guide on how to compile a custom OpenCV library with CUDA and DNN acceleration support from source using CMake and Visual Studio 2019.

C-Cpp CMake Win10 OpenCV

2012 (1 post)

05-10

Configuring FFTW 3.3.2 in Visual Studio 2010

A step-by-step guide on how to install and configure the FFTW 3.3.2 library in a Visual Studio 2010 environment, including .lib file generation and project property setup.

FFTW 3.3.2 VS2010 Visual Studio 2010 C-Cpp