Author: kiwi

  • SSPAI Review | Best New Apps to Try This Week

    SSPAI Review | Best New Apps to Try This Week

    Vibe Island: An AI Agent Assistant Living in the Dynamic Island

    • Platform: macOS
    • Keywords: Dynamic Island, AI Agent

    @Vanilla: In the era of Vibe Coding, products capable of delivering various functionalities are no longer scarce. The massive surge in app review volumes on the App Store this year shows that creating applications has almost become a zero-barrier task. However, apps with taste are still “works of art” in this world—people like me, without programming or design expertise, still can’t create something truly refined.

    In my view, Vibe Island is exactly such a tasteful creation. While using the MacBook notch area to display and interact with AI Agents is nothing new—and there are already open-source, free alternatives on the market—developer Edward Luo has crafted Vibe Island into a beautifully designed, feature-rich product that stands out among its peers.

    After installation, Vibe Island automatically establishes connections with local Agents or terminal applications via hooking, requiring no setup or deployment to start using it. Its functionality can be divided into four main parts: monitoring, authorization, inquiry, and navigation. Let’s go through them one by one.

    In the monitoring section, Vibe Island displays the current session’s title and task overview, allowing you to quickly track the progress of AI Agent tasks through the Dynamic Island. It currently supports ten AI Agents, including Claude Code, Codex, Gemini CLI, Cursor, OpenCode, Droid, Qoder, Copilot, CodeBuddy, and Kiro, as well as 13 terminal applications such as iTerm2, Ghostty, Warp, Terminal.app, VS Code, and Cursor. All of these can be distinguished within the Dynamic Island, helping you clearly manage multi-threaded tasks.

    In the authorization section, Vibe Island displays permission prompts directly in the Dynamic Island, offering options like Deny, Allow Once, Allow Always, and Auto-Approve. You can also confirm actions quickly using keyboard shortcuts. This brings two key benefits: first, you can respond to permission requests immediately, preventing tasks from being paused for too long; second, you don’t need to open the terminal app to grant permissions, so your current workflow remains uninterrupted.

    The inquiry section works similarly to authorization. You can respond to questions from the AI Agent directly within the Dynamic Island, helping guide the agent’s next steps according to your preferences while reducing unnecessary interaction with terminal apps.

    Finally, the navigation feature allows you to click on any item shown in the Dynamic Island to jump directly to the corresponding session in the terminal application, making it easy to view details or input commands.

    In the settings panel, Vibe Island offers several essential customization options. For display, you can adjust the notch style between compact and detailed modes; in the panel section, you can modify font size, completed card height, and maximum panel height; for AI Agents, you can enable or disable detailed activity display.

    In the sound and shortcut settings panel, Vibe Island provides a wide range of options to meet personalized needs.

    As a refined example of Vibe Coding, Vibe Island is a native macOS application written in Swift and optimized for Apple silicon. It is under 50MB in size and avoids the heavy memory usage typical of Electron apps, delivering fast responsiveness and smooth performance.

    Vibe Island is available for download and trial on its official website. The current early-bird price is $14.99 per device, with an additional $10 for each extra device license.

    mynd: Turning Everything Into a Chat-Based Record

    • Platform: iOS
    • Keywords: journaling, logging

    @ElijahLee: Well-known blogger cbvivi recently developed and released a simple app called mynd, which allows users to record their day, moods, thoughts, and all kinds of random notes through a chat-like interface.

    mynd is extremely minimal, with a size of just 5.9MB. Opening the app takes you straight into a conversation screen, where you can record everything as if you were chatting. Each message you send becomes part of the chat history. What’s fascinating is that after some time, your earlier messages will appear as if they were sent by another “you” to your present self. The current “you” can reply or start a new topic—like a dialogue between versions of yourself across time.

    The name mynd comes from a simplified version of “mymind.” Inside the app, there is only you. Like a private tree hollow, you can send anything to mynd, and only you will know. There are no reminders, no streaks, no AI—nothing that might feel like a burden.

    The app supports sending text, images, videos, and voice recordings, all accessible via the “+” button on the left side of the input field. Messages can be edited, quoted, or pinned, making it as convenient as using WeChat. Swipe a message to the right, and the blue chat bubble turns yellow, marking it as a highlight that will stand out in your history. Beyond simple recording, messages can also be turned into to-do items—just double-tap a message to bring up a task circle.

    In the top-right corner, you can access the content and notes categorization page, where mynd automatically organizes entries into pinned, to-do, highlights, excerpts, and more. You can also manually create collections to group notes by theme or category.

    mynd offers home screen widgets and Shortcuts support for quick entry. The widgets can also randomly display past entries and highlights. The app includes multiple color themes, extensive customization options, automatic iCloud backup, and multi-device syncing.

    You can download mynd for free on the App Store. A one-time payment of 18 RMB unlocks premium features, including iCloud sync, full feature access, and avatar customization.

    LibreFit: Fitness, with a Plan

    • Platform: Android
    • Keywords: fitness

    @Peggy_: It all started when I found a barbell in the storage room. Its appearance meant I could finally try some weight training I’d never done before—and perhaps go from beginner to giving up step by step. But when faced with the overwhelming number of training plans out there, I got stuck: how do I remember the sequence, duration, and sets for each workout? I believe many fitness enthusiasts have their own routines and workflows. LibreFit, the app introduced today, helps present these routines clearly on your phone screen, eliminating the need to memorize your training process every time.

    First, you need to create your own workout plan within the app. After naming it and adding notes, you can begin adding exercises. LibreFit’s advantage lies in its built-in library of common exercises. All you need to do is tap and add the ones you need into your plan. Of course, you should still have a clear idea of your training flow to avoid blindly adding too many exercises, which could make your routine ineffective. LibreFit also includes a simple exercise filter, allowing you to quickly find suitable movements based on your ability level, preferences, and target muscle groups.

    Even more thoughtfully, tapping into an exercise’s details reveals descriptions and the muscle groups involved. However, the demo videos are AI-generated and should be used for reference only. It’s recommended to carefully read the key points of each movement and, if possible, train under professional guidance.

    Once you’ve added your exercises, you can move to the workout list to configure sets, duration, and rest intervals for each movement. These settings vary depending on the type of exercise—for example, machine-based workouts usually require weight and reps, while bodyweight exercises may only need reps, and certain movements (like mountain climbers) require duration per set.

    After starting a workout, you can manually mark each set as completed. The app will automatically begin timing the rest interval and display your overall progress at the top. As you accumulate more sessions and training time, LibreFit also generates statistics to enhance visual feedback and help maintain motivation.

    If you’re looking for a fitness-focused app, you can download LibreFit via F-Droid and try it out—it’s completely free.

    Dropzone 5: A Brand-New Interface and New Customization Features

    • Platform: macOS
    • Keywords: file transfer, file operations

    @化学心情下2: The long-standing file staging and processing tool Dropzone has recently received a major 5.0 update. A completely redesigned interface is undoubtedly one of the biggest highlights of this version. Dropzone 5 introduces a Liquid Glass design style inspired by macOS Tahoe, along with a grid-based layout for information, and overall smoother animations during operation.

    The grid system is Dropzone’s most important feature. Through it, you can build specific workflows, and this system has been further enhanced in Dropzone 5. You can switch between different grids via the top dropdown menu, and even create dedicated grids for specific actions, allowing you to quickly switch and execute tasks when needed.

    The grid system is also highly customizable. You can adjust the number of columns, define the position of each row (for example, placing folders or apps in the top row), and most importantly, tailor it to quickly perform the actions you want.

    macOS Tahoe supports custom folder icons and colors, and Dropzone 5 integrates this feature into its grid system. You can sync the adjustments you’ve made to folders in Finder directly within the grid, allowing you to instantly recognize their purpose and function when opening it.

    The final new feature follows the recent CLI trend—Dropzone 5 introduces its terminal tool, dz. You can run dz commands in your desktop terminal to perform Dropzone actions, such as adding files to the Drop Bar or switching between grids. If you have development experience, you can even build automation scripts based on this tool.

    You can download and try Dropzone 5 from the Mac App Store or Setapp. The developer is currently offering a launch discount: new users can get 30% off, while Dropzone 4 users can upgrade to version 5 at half price.

    Infuse 8.4: Automatically Skip Intros and Credits

    • Platform: iOS / iPadOS / tvOS / macOS / visionOS
    • Keywords: media player, media management

    @Snow: Last month, the video player Infuse was updated to version 8.4, introducing two highly practical features: expanded extras content display and automatic skipping of intros and credits. The quotation marks are intentional, as both features had already been gradually tested in previous versions through phased rollouts across different channels—this update essentially “completes” their global implementation.

    As early as version 7.7.x, Infuse had begun supporting intro and credit skipping for third-party services such as Emby, Plex, and Jellyfin. With the help of the TheIntroDB database, this feature is now fully supported across the app. You can adjust it in “Preferences – Playback Settings – Skip Intros and Credits.” However, TheIntroDB’s dataset is still somewhat limited, especially for content from Chinese streaming platforms, where recognition rates remain low. If your videos are not correctly identified, you can register at theintrodb.org and manually submit timing data. Once approved, it will be automatically applied in Infuse.

    Infuse offers four options: On, Automatic (Delayed), Automatic (Immediate), and Off. When set to On, the app displays a floating “Skip” button when it detects intros, credits, or recaps. Tapping it skips the segment; otherwise, the video continues normally, and the button disappears after a short time. With Automatic (Delayed) and Automatic (Immediate), the app skips detected segments automatically. The difference is that Automatic (Delayed) displays the button for three seconds, allowing you to cancel the skip if desired—offering a balance between convenience and control. Personally, I recommend this option.

    When Infuse introduced its redesigned UI in version 8.0, trailers were already presented as a key feature on the detail page. With version 8.4, the app further improves the integration of extras such as behind-the-scenes clips, deleted scenes, and special episodes. These are now automatically displayed alongside the main content or within the same series, rather than appearing as separate entries or nonexistent episodes as in previous versions.

    However, Infuse still relies on file naming conventions and folder structures. Third-party services like Emby typically standardize this upstream. If you manage your own library, you’ll need to use naming rules such as the “Extras” folder or “S00” season to organize content. Unlike some domestic apps, it does not yet use AI to automatically sort and categorize media collections.

    In this update, Infuse finally introduces separate language settings for metadata and poster artwork. In “Preferences – Language,” you can set different languages for audio, subtitles, metadata, and posters. Since many films lack Simplified Chinese posters, previously setting both metadata and artwork to Simplified Chinese often resulted in a mix of Simplified, Traditional, and English posters. Now, you can set poster artwork to English for a more consistent visual presentation.

    Unfortunately, the “original language” option available for metadata matching is not yet supported for poster language settings. If you’re aiming for a fully authentic poster wall, you may need to wait for future updates.

    You can download Infuse for free from the App Store.

    Fucks Given: Give a “Middle Finger” to the Little Things That Don’t Matter

    • Platform: Android
    • Keywords: daily logging

    If you feel exhausted by life or work, it might not actually be life or work itself—but a specific thing that keeps draining your energy. Maybe it’s someone cutting in line during rush hour, an argument on social media, or a spilled bag of takeout soup. These seemingly small “annoyances” can loop in your mind and quietly consume your mental energy. If we don’t handle these everyday disruptions well, each future encounter with similar trivialities will gradually drain the energy that should be making us happier.

    If you want to stop this kind of “mental drain,” why not start by keeping an “account”? Try this bluntly named app, Fucks Given (if you’re familiar with English slang, you’ll get the reference). What first caught my attention was a line from the developer: “Self-love is the greatest middle finger of all time.” In other words, learning to love yourself is the ultimate way to push back against everything that bothers you.

    The purpose of this app is simple: to record how many times you’ve cared about things that weren’t worth it. In essence, it’s a form of “emotional bookkeeping.” Whenever you notice yourself spiraling over something unpleasant or someone you don’t like, just open the app and write down why you’re feeling annoyed. That’s it—there are no complicated features, just pure recording.

    However, I think this act of “logging it” is actually a way to shift your attention. In the past, you might stew over something for hours, but with this app, the moment you record your frustration, you’ve already stepped out of that emotional loop. It’s like reminding yourself: “There I go again, wasting energy on something like this.” Looking back at your “emotional ledger” over a week or a few months, you might realize just how much energy you’ve spent on things that didn’t deserve it.

    Of course, you can also treat Fucks Given as a dumping ground for negative emotions—or even a diary. Anything that drains your energy can be recorded here, helping you process or ignore it, and ultimately care for yourself better. Since these emotions are often private, the app thoughtfully includes password protection, preventing others from snooping through your phone (see, Fucks Given already helps you avoid one unnecessary annoyance).

    The app is open-source, free, ad-free, and very lightweight. You can download it from F-Droid or GitHub, and it’s also available on Google Play under a more “polite” name. If you’ve been feeling irritable or low lately, try using it to confront where your energy is going. After all, our energy is limited—don’t waste it carelessly.

    Tripsy: Calendar Integration and Itinerary Duplication

    • Platform: iOS, macOS, iPadOS
    • Keywords: travel planning, travel logging

    @ElijahLee: The travel planning app Tripsy has released version 3.8, introducing features such as calendar integration and a redesigned itinerary duplication workflow.

    Integration with Apple Calendar has long been a common feature in third-party apps, so Tripsy’s support for it comes a bit late. After updating to Tripsy 3.8, calendar events are enabled by default. Within a trip, you can go to the “Itinerary” page to see Apple Calendar events interleaved with Tripsy’s own travel events. Tapping on a calendar event reveals details such as the event name, notes, location, and links. If the event is a flight, Tripsy will automatically recognize and display flight information.

    In Tripsy, you can also choose which calendars to integrate. However, the toggle for this feature is somewhat hidden—the app does not provide a global switch in the main settings. Instead, you need to go to the “Itinerary” page, tap the “…” button in the bottom-left corner, and enter Preferences to access the calendar configuration page. There, you can enable or disable integration and select which calendars to display.

    The second new feature is itinerary duplication. For frequent business trips or favorite travel routes, this feature allows you to duplicate an entire itinerary or selectively copy specific events into a new trip.

    Within a trip, go to the “Itinerary” page, tap the “…” button in the top-left corner, and select “Duplicate Itinerary.” From there, you can choose which types of events to include—such as flights, accommodations, meetings, or restaurants. You can then modify details like travel dates, total cost, notes, and booking codes, as well as rename the trip and select a cover image. Essentially, all configurable aspects of an itinerary can be selectively copied, offering fine-grained control over duplication.

    In other areas, Tripsy offers both relaxed and compact views on the “Itinerary” page. The compact view can hide location details, and in Preferences, you can disable city names to make the interface cleaner. The new flight selector has also been improved, displaying detailed information such as national flags, full airport names, IATA codes, ICAO codes, and city names, helping users quickly identify flights.

    You can download Tripsy for free from the App Store. A subscription unlocks the Pro version, with flexible pricing options: 38 RMB per month, 198 RMB per year, or a one-time purchase of 998 RMB. Among the new features in version 3.8, advanced calendar integration options are available only to Pro users. Tripsy is also included in Setapp, where subscribers can download and use it at no additional cost.

    App Updates

    Multiple apps in Apple Creator Studio have received updates. Logic Pro adds Dolby Atmos mixing preview, while Pixelmator Pro introduces templates for the iPhone 17 series devices.
    Procreate recently announced that its sixth major version is currently in development and will be released next year. It also confirmed that Procreate 6 will include a Mac version, with further updates to be shared gradually on its official forum.

  • City Walk Guide: Finding Jiangnan’s Soul in Rich, Saucy Flavors

    City Walk Guide: Finding Jiangnan’s Soul in Rich, Saucy Flavors

    Taking advantage of the long weekend at the end of April, I made a trip to Nanchang and Ganzhou. Jiangxi had never really been on my travel list before. It wasn’t that I was put off by those online jokes about the province being “surrounded by a ring of development,” but rather that I simply didn’t know much about it—enough to never feel the urge to explore.

    The turning point came late one night while scrolling through videos. It showed a humble eatery tucked inside an old residential compound in Nanchang—just a few tables set out in the open, with two aging standing fans creaking away. On screen, the host mixed a bowl of rice noodles. Maybe it was the lighting, or maybe the seasoning had some secret to it—the noodles, coated in sauce, glistened under the light, with slices of pork stomach and chopped scallions scattered on top. As someone who loves mixed noodles, I could almost smell it through the screen. I dug a little deeper and realized that Jiangxi stir-fries weren’t all as “insanely spicy” as rumored. Plus, after over a year of “training” from my girlfriend from Hunan, I had gained some confidence in my spice tolerance.

    That moment was all it took for me to decide—I had to go and taste it for myself.

    Day 0: Comfort in a Late-Night Arrival

    \With limited vacation days like any office worker, I took an evening high-speed train from Shenzhen, arriving in Nanchang well past 11 p.m. It had just rained, and the air carried a cool breeze mixed with the earthy scent of damp soil. On the taxi ride to the hotel, we couldn’t wait and ordered delivery: two bowls of mixed noodles and a jar of egg-and-minced-pork soup.

    By the time the food arrived, the noodles had clumped a bit, but after a few vigorous stirs, they regained a soft, chewy texture. The peanuts were glossy and crunchy, the pickled radish added a sharp kick. I chose the pork stomach version—the meat was tender, not exactly mind-blowing, but for a travel-worn night, it hit the spot. The soup, with its fatty pork patty slowly simmered into the broth, offered a warm, comforting sip that balanced out the direct spiciness of the noodles.

    Day 1: Freshly Stir-Fried Toppings, Museums Between Eras, Tengwang Pavilion, and Wanshou Palace

    We slept in the next morning. As forecasted, a fine drizzle hung outside the window. After a quick wash, we headed to a nearby noodle shop to try freshly made versions.

    The moment we stepped in, the temperature inside felt several degrees warmer. In the corner, a cook was stir-frying toppings over high heat, waves of hot air rolling toward us. I ordered beef brisket mixed noodles, while my girlfriend chose minced pork, and we added a jar of peanut pork rib soup as usual.

    Freshly made beef brisket noodles were on another level. A quick toss with chopsticks and the pale noodles at the bottom were instantly coated in thick, oily sauce, turning a deep, glossy brown. One big slurp and the aroma of scallion oil and soy sauce surged upward. The brisket leaned toward the leaner cut—tender without being tough—its braised flavor blending with the rich, slick noodles in a way that felt layered and satisfying, far beyond the delivery version. My girlfriend’s minced pork noodles were just as addictive.

    In the afternoon, we visited the art museum. What left the deepest impression on me was a crossover exhibition combining ceramic craftsmanship with the Dunhuang Mogao Caves. Buddhist figures were fired onto ceramic tiles, and when artificial light shone from above, the kiln-fired texture gave the Bodhisattvas a kind of frozen, tactile presence. The roughness left by the kiln fire added a sense of weight—more “lived-in” than painted figures on paper. I couldn’t help but wonder: if artisans from over a thousand years ago saw this fusion today, they’d probably find it fascinating.

    The museum building itself was also worth noting. Completed in 1969, it blends Soviet-style architecture with Chinese elements like a “mountain”-shaped layout and pavilion-like roofs with extended eaves. Taking the elevator up from the metro entrance, watching the iconic red star on the rooftop gradually emerge from behind the eaves felt like a cinematic long take.

    The following museum offered a concise overview of Jiangxi’s history, including the origin of its name—derived from “Jiangnan West Circuit.” Over the next few days, I would come to feel the blend of orderly Song dynasty architectural aesthetics with the everyday warmth and the lingering charm of misty Jiangnan.

    We returned to the hotel around 4 p.m. to rest. Two hours later, the rain stopped, and a few rays of sunset broke through the clouds, casting light over the monument and the gradually illuminated museum.

    We found a quiet spot a few hundred meters from Tengwang Pavilion. From afar, the wooden structure’s joinery was clearly visible, with the Gan River and modern buildings across the bank forming a dialogue between past and present. Today’s pavilion is nothing like it was over a thousand years ago—but I found that acceptable. Even with an elevator installed, a hundred years from now, it will still be a historical structure—just one with an elevator. Looking up at it from below, its imposing grandeur suddenly made me understand how Wang Bo could write such majestic lines. The colorful lighting, however, felt a bit tacky—warmer tones might have suited it better. Since I couldn’t recite the “Preface to Tengwang Pavilion” to get free admission, I reluctantly skipped going inside, circling around instead, with a hint of regret over the ticket price.

    We then headed to Wanshou Palace. The area has been commercially developed, with lights and crowds intertwining, yet it hasn’t lost the charm of an old city. Passing by a small building, we noticed a performer sitting by a second-floor window, veiled lightly, playing the erhu with lowered eyes. The melody drifted down into the street, unexpectedly clear amidst the noise, like a fine thread tugging gently at the heart. Some people stopped to listen, others raised their phones, while some simply stood quietly. That bright, lingering sound softened both the body and mind—reminding me that beyond its bold, fiery flavors and lively street life, Jiangxi also holds a gentle, flowing grace.

    Too tired to go out again that night, we ordered delivery once more. A few dishes stood out:

    Braised beef brisket with fried eggs: the egg’s surface puffed into airy pockets, soaking up the rich meat broth—an absolute rice killer.

    Spicy bullfrog: chopped into small pieces, paired with pickled chili and perilla leaves, tender and aromatic. For me, as someone from Guangdong, the spice level was just at the edge of my tolerance—but the combination of heat and sweetness made it hard to stop eating.

    Finally, a plate of blanched baby bok choy with hot oil rounded out the meal perfectly, balancing the richness of everything else.

    Day 2: An Oasis Hidden in Living Neighborhoods, Ancient Temples, and Alleyways

    The rain finally stopped on our second day in Nanchang. To avoid the crowds at major tourist spots, we chose Beihu, about a twenty-minute drive from Wuyi Square. Getting off at Dunzitang Station, we walked south along the lake before turning west, entering an area closely tied to local residential life.

    Beihu turned out exactly as I had imagined—not a famous scenic destination, but more like a small lake woven into the city’s fabric, reminiscent of Shichahai. There were no overly curated landscapes; the shoreline sat right next to apartment buildings, and every few steps you’d pass a modest café. These places didn’t try to stand out too much from the old neighborhood—they blended in naturally. We passed a Jiangxi-style restaurant tucked inside a courtyard; even before stepping in, the aroma of stir-fried dishes drifted out. A few tables were scattered in the yard, and the ground floor had been opened up into a dining space. Unfortunately, we had just eaten and had to give it a miss.

    A parked electric scooter by the roadside caught my attention. Nestled inside the wind cover at the footrest was a small cat, curled up peacefully. It wasn’t clear whether its owner had placed it there or it had climbed in on its own. The zipper had been thoughtfully pulled halfway, leaving a small opening for air. When we leaned closer, it didn’t move away—just stretched lazily into a more comfortable position and continued its afternoon nap.

    Walking further, traces of everyday life became rougher yet more real. There were street vendors selling fruits and vegetables, and open communal spaces where people came and went freely. In an area called “T16 Oasis,” teahouses, book cafés, and small shops gathered together. It’s not somewhere worth traveling across cities to visit, but for locals, having a place to stroll and chat with friends like this is a very tangible kind of happiness.

    Passing by Youmin Temple near Beihu, we didn’t have time to go inside and only paused briefly outside the walls. Beyond the temple walls, traffic and street vendors’ calls rose and fell, while the faint scent of incense drifted gently into the air. The temple sits hidden within the bustle of the city, surrounded by noise, yet it carries a quiet, weighty solemnity. That contrast—ancient architecture wrapped in modern life—creates a peculiar sense of calm. If I return, I’d definitely go in and sit for a while, to experience the stillness of the birthplace of Hongzhou Chan. It would probably form the perfect counterbalance to Jiangxi’s rich and spicy cuisine.

    Our final stops were Dashiyuan and Hamma Street. You can think of them as Nanchang’s version of Taiping Street or Shangxiajiu. While the products can feel somewhat homogeneous, it remains the most convenient place to pick up souvenirs before leaving, or to grab a portion of boiled street snacks on the go.

    At noon before departure, we finally found an authentic hole-in-the-wall eatery.

    • Braised chicken feet with soybeans were the first pleasant surprise. The beans were soft and tender, and the chicken feet had been stewed until rich with gelatin, falling off the bone at a touch. The chili aroma had long seeped into the bones. The thick, glossy sauce was perfect over rice—this was the real definition of a “rice killer.” The chicken feet themselves tasted clean and fresh.
    • The fermented black bean yellow croaker was the owner’s recommended signature dish. A spoonful revealed tender, flavorful fish, and when dipped into the soy-based black bean sauce at the bottom, the savory richness clung just right. It melted in the mouth, more delicate than expected.
    • The third dish, stir-fried water spinach with garlic, was my insistence as someone from Guangdong—a touch of green on the table. However, the cooking was slightly rushed, leaving a faint raw edge to the flavor.

    Day 3: Song Dynasty Relics, a Sichuan Restaurant, Climbing Towers and Wandering Alleys

    By the time we arrived in Ganzhou and checked into our hotel, it was already close to 9 p.m. We ordered some barbecue nearby—reasonably priced, but the taste was underwhelming, so I won’t dwell on it.

    Out of habit, I checked the weather forecast. March and April are the rainy season in Jiangxi, and it seemed our entire stay in Ganzhou would be accompanied by rain. But since we were already here, we decided to stick to our original plan.

    Our first stop was the Confucian Temple in the old city. There were very few visitors. Red walls, yellow tiles, and ancient cypress trees created a quiet atmosphere that made you instinctively lower your voice. Standing in front of the Dacheng Hall, I suddenly remembered how anxious I used to feel before exams as a student. Before my high school entrance exam, I once passed by a church near my school and saw a priest praying for students. Now, looking at these young faces here praying for blessings, I felt a mix of emotions: the ancients prayed for official success—what are we seeking today? Though the forms of belief differ, whether it’s Gothic spires or traditional glazed tiles, they all seem to quietly hold a sense of compassion for the human condition.

    Looking up through the trees, I spotted the nearby Ciyun Pagoda—a Song dynasty brick tower standing quietly within a primary school campus. Past and present overlap like this, and somehow it made me feel at ease: some things, even after a thousand years, still remain embedded in everyday life.

    Leaving the temple and walking through a residential area, the sky cleared. Bougainvillea blooming from corners and balconies stood out vividly. A few steps further, the Song dynasty city wall stretched along the Gan River. Walking on the wall, what struck me most were the bricks—inscriptions from the Northern Song to the Republican era still clearly visible. Running my hand over them, the rough texture felt real, like flipping through a living history book, with the weight of time pressing in.

    Following the wall northward, we reached Jianchun Gate. Looking down from the wall, an ancient pontoon bridge spanned the river—over a hundred wooden boats linked end to end. From above, the structure felt even more striking than from the bridge itself. Walking across, the wooden planks creaked underfoot, with pedestrians passing by. This floating bridge, in use for over 800 years, still connects both sides of the river—a living relic that has never been confined to a museum.

    Shouliang Ancient Temple and Zao’er Alley were also within walking distance. Zao’er Alley is only a few hundred meters long but connects to many narrow side alleys, lined mostly with Ming and Qing dynasty buildings. Unfortunately, preservation and revitalization here seem to be struggling. Many old houses stand empty, their doors locked with warning signs of structural danger, with only a handful of restaurants still operating.

    Lunch hadn’t been planned. By the time we reached the Standard Clock area, it was nearly noon and the air had turned warm, so we decided to eat nearby. A restaurant at the street corner caught our attention. Its black-and-gold sign read “Bobo Sichuan Restaurant” in both Chinese and English. With a red awning and arched glass doors, its design echoed the blend of Chinese and Western styles from the Republican era. Inside, the décor centered around wood elements, with several round tables arranged around a glass-covered courtyard. A tree stood in the middle, with natural light filtering down, complemented by warm indoor lighting, creating an intimate atmosphere. Through a side window, we could clearly see the chefs at work in the kitchen.

    • The tomato scrambled eggs followed the classic sweet-and-savory style, perfect with rice.
    • The stir-fried pork kidney and fresh stir-fried beef shared a similar base flavor, rich with scallion aroma and Sichuan pepper notes that boldly awakened the palate. The beef was tender, and the scored pork kidney carried no off-flavors—both dishes were executed cleanly and skillfully.

    After a short rest at the hotel in the afternoon, we took bus route 1314 to the Song Dynasty Night City. The bus itself had a thoughtfully designed retro style, and the fare was affordable.

    The Song Dynasty Night City sits at the confluence of three rivers. Broadly speaking, the Jiangnan Song City scenic area includes sites such as Junmen Tower, Yugutai, and the Hakka Compound. By the time we arrived, it was already evening, so for safety reasons, we focused on the streets behind Junmen Tower and Yugutai.

    That night, we visited Yugutai. Standing below and looking out at the Qing River flowing north, I suddenly felt a sense of emptiness. I only remember the first half of Xin Qiji’s line—“Below Yugutai flows the Qing River, carrying countless tears of passersby”—but that feeling of inevitability, of “the green mountains cannot stop the eastward flow,” strangely contrasted with my own spontaneous trip here as an office worker. The river, blurred after the rain, carried a hazy sense of age and melancholy. I may not fully grasp the historical weight of ancient worries, but that sense of time spanning centuries hit me squarely in the chest. Missing the chance to see it under clear skies was a bit of a regret.

    For dinner, we chose a restaurant converted from an old courtyard near Yugutai Park.

    • The freshly simmered pear dessert soup was light, cold, and refreshing, paired with glutinous rice balls—a perfect way to cool down.
    • The sizzling small yellow croaker, cooked in a clay pot with garlic and shallots, resembled Cantonese “jue jue” cooking. The fish was tender, though my girlfriend noted a slight fishy taste.
    • The abalone and chili stir-fried pork, priced at 68 RMB, was not cheap by local standards, mainly due to the addition of abalone. The fatty pork combined with the chewy texture of abalone made for a richer mouthfeel than the usual version—whether this pairing feels excessive or not depends on personal preference.

    Day 4: Bajing Park

    On our final day in Ganzhou, the sky was overcast, but at least it didn’t rain. We took the opportunity to complete our walk along the Song dynasty city walls, starting from Yongjin Gate and making our way to Bajing Park.

    Climbing up Bajing Terrace, we found ourselves at the confluence of the Zhang River and the Gong River, which merge here to form the Gan River—an extremely significant geographic point. Looking out from above, the rippling patterns where the two rivers met were clearly visible, giving a tangible form to the joys and sorrows described by ancient writers. Walking through Bajing Park, the gloomy sky actually deepened the rich, inky greens of the water and trees. One or two stone arch bridges stretched across the scene, like bright accents in a dark-toned painting, perfectly guiding the eye. Unfortunately, the rain soon picked up from a drizzle to a downpour, and we couldn’t linger long, forced to turn back in haste.

    Our final meal was settled near our accommodation. The fermented rice dumplings carried a clean, mellow aroma without any harshness; the duck tongues were intensely spicy, while the tofu soaked up the rich, savory broth; the Ai Mi Guo was filled with pickled vegetables, cured meat, and dried tofu—distinctly local in character; and the home-style fried rice noodles stood out for their wok hei, brought out by scallion oil, eggs, and chili. After finishing this meal full of everyday warmth and flavor, we set off for the high-speed rail station.

    Travel Notes: Clothing, Accommodation, and Photography Tips

    Clothing Suggestions
    In early April, Jiangxi has a mild climate. On days without rain, daytime temperatures can reach around 28°C, while at night or during temperature drops, it can fall below 20°C. A short-sleeved shirt paired with a light jacket and jeans is enough to handle the weather.

    Accommodation Reference
    For this trip, we chose Atour hotels in both Nanchang and Ganzhou. They are considered mid-to-high range locally, but compared to first-tier cities, the cost-performance ratio is still excellent. Both properties were relatively new, and the service was solid. However, the breakfast offerings were somewhat less impressive than what I had experienced in other cities. If you want to explore local flavors, it’s better to head to small street-side breakfast spots—you’re more likely to find pleasant surprises.

    Photography Gear and Experience
    Since most of the itinerary involved walking and wandering, I didn’t plan to bring a dedicated camera. Coincidentally, the vivo X300s had just been released, so I borrowed a unit from a colleague at SSPAI to try it out during the trip.

    Compared to the previously used X100 Ultra, the camera module on this model is less bulky and easier to carry, reducing the overall burden. Fortunately, its telephoto performance remains solid. Throughout the trip, I used its “Street Photography” mode to document everything. Here are a few observations:

    • Image characteristics: Compared to the standard mode, the Street mode applies lighter computational processing. It enhances contrast between light and shadow while preserving sufficient detail in both highlights and darker areas, avoiding the overly smoothed, “plastic” look.
    • Filter preferences: I mostly stuck with the “Textured” and “Negative” presets. These come with a film-like grain and soft glow. During shooting, slight adjustments to saturation and exposure are enough to create a strong personal style. Even without post-processing, the images look very pleasing.
    • Accessory expansion: When paired with the official photography kit, you can also use a teleconverter lens. This allows the phone to remain usable even at hybrid zoom levels of 200x or 400x, making it easier to capture distant subjects or architectural details outdoors.

    Conclusion

    What started as a spontaneous trip driven by a craving for bold, spicy flavors turned into a journey through layers of time. In Jiangxi, the boundary between past and present feels blurred: the poetry found in ancient texts is now wrapped in the rich, oil-heavy, and spicy rhythms of everyday life. The temples I didn’t visit, the small eateries I haven’t yet tried, and the parks left unexplored—all of them have become reasons to return.

    Except for the phone appearance photos at the end, all travel images in this article were taken with the vivo X300s.

  • SSPAI Morning Brief: Microsoft Revamps Windows Insider Program, Linux Kernel Introduces AI Code Contribution Rules

    SSPAI Morning Brief: Microsoft Revamps Windows Insider Program, Linux Kernel Introduces AI Code Contribution Rules

    Morning Brief

    1. Microsoft announces improvements and simplification to the Windows Insider Program
    2. Hong Kong issues stablecoin issuer licenses to HSBC and Standard Chartered
    3. Red Hat lays off its China R&D team
    4. The Linux kernel project introduces rules for AI-generated code submissions
    5. Betting on weather becomes a trending trading strategy
    6. Cyberspace Administration and Railway Authority summon third-party train ticket platforms for talks
    7. News Worth a Quick Look

    Microsoft announces improvements and simplification to the Windows Insider Program

    On April 10, Microsoft announced improvements to the Windows Insider Program to address long-standing complaints about its increasingly fragmented structure.

    The revamped program will streamline multiple channels into two: an Experimental channel and a Beta channel. The Experimental channel replaces the former Dev and Canary channels, allowing users to try cutting-edge features still under active development; the new Beta channel will provide features a few weeks ahead of their release to the stable version. The existing Release Preview channel will be retained but moved under advanced options, primarily serving enterprise customers who need early access to near-final builds.

    The feature rollout mechanism is also being adjusted. In the past, Microsoft implemented a “controlled feature rollout” system in the name of quality assurance, meaning users within the same channel often received new features at different times. Going forward, the Beta channel will completely eliminate this gradual rollout approach, allowing users to access all officially announced features immediately after updating. Meanwhile, Microsoft will introduce a Feature Flags page in the Experimental channel settings, enabling advanced users to manually toggle specific features on or off.

    In addition, users previously had to reinstall their systems if they wanted to switch channels or exit the Insider Program entirely. To lower this barrier, Microsoft will introduce an in-place upgrade mechanism. Except in rare cases—such as when running builds based on future system foundations—users will be able to switch between channels or exit the program without losing apps, settings, or personal data.


    Hong Kong issues stablecoin issuer licenses to HSBC and Standard Chartered

    According to Caixin, on April 10, the Hong Kong Monetary Authority (HKMA) announced the issuance of its first batch of stablecoin licenses to RD Innotech Limited and HSBC. RD Innotech is a joint venture formed by Standard Chartered, HKT, and Animoca Brands. Stablecoins are cryptocurrencies backed by fiat currencies, commodities, or other assets, meaning their value is not entirely determined by market forces and typically does not deviate significantly from their underlying peg.

    The HKMA stated that both licensed issuers plan to launch Hong Kong dollar-denominated stablecoins in the initial phase, targeting four key application areas: cross-border payments, local payments, tokenized asset trading, and supply chain financing. RD Innotech is expected to roll out HKDAP in phases starting in the second quarter of 2026, while HSBC plans to launch its HKD stablecoin in the second half of 2026, integrating it with widely used services such as PayMe and the HSBC Hong Kong mobile banking app.

    A deputy chief executive of the HKMA noted that the choice of currency is determined by the issuer’s business plan rather than regulatory requirements. If issuers wish to launch stablecoins denominated in other currencies, they must submit proposals for approval, which will be evaluated alongside regulatory requirements in other jurisdictions.

    The HKMA reported receiving 36 applications, with evaluation criteria focusing on applicants’ risk management capabilities, regulatory compliance across jurisdictions, and the feasibility of their proposed business models and use cases. The authority remains open to issuing additional licenses in the future.

    In May 2025, both the United States and Hong Kong accelerated efforts to legislate stablecoin regulation. In July of the same year, the U.S. passed the GENIUS Act, while Hong Kong’s Stablecoin Ordinance came into effect on August 1, 2025. On November 28, 13 Chinese government agencies, including the People’s Bank of China, reiterated their crackdown on cryptocurrency trading and classified stablecoins as virtual currencies, effectively ruling out their trading within mainland China.


    Red Hat lays off its China R&D team

    According to The Register, U.S.-based open-source software giant Red Hat has recently dissolved its China R&D team, relocating most engineering roles to India. The layoffs are expected to affect between 300 and 500 employees.

    The move came abruptly. Employees claiming to be Red Hat engineers in China reported on forums such as Hacker News that their VPN access was suddenly cut off and internal system permissions revoked, followed shortly by termination notices. A leaked internal memo confirmed the decision. In the memo, Red Hat CTO Chris Wright stated that the company is shifting its R&D focus toward an “Asia-Pacific hub,” with India as a key investment region, and that the relocation would not reduce the company’s global R&D headcount.

    Red Hat has long provided technical support to the U.S. military and secured an $848 million software contract with the U.S. Department of Defense in 2024. Relocating R&D operations may help mitigate national security scrutiny from Washington. Previously, Microsoft faced Pentagon criticism for involving engineers based in China in Azure projects supporting the U.S. military, and ultimately ceased using China-based staff for such work in 2025. Meanwhile, Red Hat’s parent company IBM now employs more people in India than in the United States.

    Despite the complete withdrawal of its R&D presence, Red Hat has not halted commercial operations in China. Given China’s push for domestic IT alternatives and the open-source nature of many Red Hat technologies (such as CentOS derivatives), local vendors can still legally access and build upon its code.


    The Linux kernel project introduces rules for AI-generated code submissions

    The Linux kernel project has officially adopted a policy allowing AI-assisted code contributions, bringing months of heated internal debate to a close. Under the new rules, contributors using AI tools must include an “Assisted-by” tag to disclose such assistance, rather than using the legally binding “Signed-off-by” tag. Any bugs, security issues, or license violations arising from AI-assisted code will be the sole responsibility of the submitting developer.

    For the open-source community, the originality of AI-assisted code remains a particularly thorny issue. Since AI models are often trained on code with restrictive licenses, developers cannot easily prove the legal origin of their contributions. Previously, undisclosed AI-assisted submissions had already sparked widespread backlash. Late last year, NVIDIA engineer and kernel maintainer Sasha Levin submitted patches generated by an LLM without disclosure, leading to performance regressions and strong protests. Around the same time, the GZDoom open-source project fractured after a core developer concealed the use of AI-generated code. Projects such as Gentoo and NetBSD have since banned AI-generated contributions entirely to avoid copyright risks.

    In addition, the proliferation of AI tools has led to a surge in low-quality code and issue submissions, with projects like cURL and Node.js facing ongoing waves of spam-like contributions.

    In discussions, the Linux founder adopted a pragmatic stance, arguing that AI is fundamentally just a tool and that outright bans are both ineffective and unenforceable. Instead, he emphasized the importance of accountability for human developers—a position that ultimately shaped the final policy.


    Betting on weather becomes a trending trading strategy

    According to Bloomberg, weather prediction markets are experiencing explosive growth. On platforms such as Kalshi and Polymarket, participants ranging from weather enthusiasts to AI companies are increasingly placing bets on specific events like snowfall and temperature changes. In January alone, a single contract tied to a U.S. snowstorm saw trading volume exceed $6 million.

    This emerging market is also becoming a testing ground for weather tech companies to refine their AI models. Some firms encourage employees to participate in betting to identify data noise in official meteorological stations and improve forecasting algorithms, while others have established investment funds to arbitrage using their proprietary models. Meanwhile, retail participants with little meteorological background have reported substantial profits by betting on temperature outcomes in cities like New York and London.

    The scientific community and insurance industry are also exploring customized weather prediction markets. Institutions such as French reinsurer SCOR are sponsoring markets where experts can bet on macro trends like El Niño or hurricane frequency, providing valuable pricing signals for the insurance sector.

    Analysts suggest that prediction markets, driven by direct financial incentives for accuracy, may outperform traditional government forecasts. However, concerns are growing as climate change intensifies and extreme weather events become more frequent. Critics warn that turning weather forecasting into a form of gambling could encourage zero-sum speculation, data manipulation, or even deliberate interference with meteorological monitoring systems.


    Cyberspace Administration and Railway Authority summon third-party train ticket platforms for talks

    On April 10, the Cyberspace Administration of China announced that, in accordance with the Cybersecurity Law and the Regulations on the Security Protection of Critical Information Infrastructure, it, together with the National Railway Administration, recently summoned seven third-party internet platforms involved in train ticket sales, including Ctrip, Tongcheng, Qunar, Fliggy, Meituan, Zhixing Train Tickets, and High-Speed Rail Manager. The authorities required these platforms to strictly comply with relevant cybersecurity laws and regulations, and not to use automated programs to conduct large-scale, high-frequency ticket-grabbing operations that interfere with the security verification mechanisms of the Railway 12306 platform, nor to disrupt or endanger its stable and secure operation.

    The Cyberspace Administration stated that relevant departments will strengthen technical monitoring going forward. Any use of technical means to interfere with or undermine the security of the Railway 12306 platform will be dealt with strictly in accordance with laws and regulations such as the Cybersecurity Law and the Regulations on the Security Protection of Critical Information Infrastructure.

    The Railway 12306 technical team had previously stated that third-party ticket-grabbing platforms, through high-frequency requests, consume large amounts of server resources and bandwidth—effectively resembling DDoS attacks—which can slow system response times, cause lag, or even lead to system crashes. Ahead of the 2026 Spring Festival travel rush, 12306 announced upgrades to its anti-bot system, incorporating multi-dimensional analysis including access frequency, user behavior, device characteristics, account credibility, and network IP. Suspicious requests are placed into a slow queue for processing, with the system capable of intercepting tens of millions of abnormal access attempts per day.

    In December 2025, the Beijing Municipal Administration for Market Regulation organized an administrative meeting with 12 platforms including Ctrip, Qunar, Fliggy, Tongcheng, Meituan, and High-Speed Rail Manager, focusing on misleading promotions such as “speed-up packages,” “dual channels,” and “ticket monitoring,” as well as implications that paid services could grant priority access to tickets, and required rectifications.


    News Worth a Quick Look

    • On April 10, YouTube Premium in the U.S. announced another price increase. The individual plan rose from $13.99 to $15.99, the family plan from $22.99 to $26.99, and the Lite plan—which removes only some ads—from $7.99 to $8.99. YouTube Premium had previously raised prices in the U.S. and multiple international markets in 2023 and 2024. Last month, Netflix and Amazon Prime Video also increased their prices.
    • Mark Gurman claims
      • Recent supply chain rumors suggesting that the foldable iPhone is facing production bottlenecks and may be delayed until 2027 are inaccurate. Apple is reportedly not encountering major mass production issues, and the device remains on track to debut in September, with a market release expected shortly thereafter.
      • Apple’s smart glasses, internally codenamed N50, are undergoing intensive testing. The device will not feature a display, instead relying on an integrated array of cameras and microphones to support photo and video capture, audio playback, and AI voice interaction. It is made from high-end acetate materials, and the design team is currently testing at least four frame styles, including slim rectangular, classic wide rectangular, and oval shapes. The product is planned for release in 2027.
      • Former Apple AI chief John Giannandrea is set to officially leave the company after his stock vesting period ends on April 15, one year after stepping back from leading Apple Intelligence due to underwhelming performance and repeated delays in Siri upgrades. He is expected to move into advisory or board roles at startups after his departure.
    • According to CoinDesk, rising global energy prices driven by geopolitical tensions in the Middle East have put Bitcoin miners under significant pressure. Data from analytics platform Checkonchain shows that by mid-March 2026, the average production cost of one Bitcoin had risen to $88,000. In comparison, the market price hovered around $69,200, implying a loss of about $19,000 (21%) per coin mined.
    • According to The Washington Post, Anthropic recently held a closed-door meeting at its San Francisco headquarters, inviting around 15 Christian leaders. Over the two-day event, discussions focused on guiding the ethical and spiritual development of AI, covering topics such as how chatbots should respond to grief or self-harm tendencies, and even whether AI could be considered a “child of God.” Attendees noted that Anthropic’s team expressed significant concern over the growing unpredictability of AI systems. Researchers studying internal model mechanisms recently suggested that systems like Claude may already exhibit “functional emotions.” Some participants stated they were unwilling to rule out the possibility that humans might bear moral obligations toward the AI systems they create. Religious and academic attendees viewed the initiative as an attempt by Anthropic to move beyond Silicon Valley’s traditionally secular mindset and seek ethical guidance from external belief systems. The summit is reportedly just the beginning, with Anthropic planning further engagements with representatives from other philosophical and religious traditions.
  • An Introduction to Agent Experience

    An Introduction to Agent Experience

    With the continued expansion of LLM applications, Agent Experience (AX) has emerged as a prominent concept, beginning to circulate widely in engineering circles. In January 2025, Mathias Biilmann, co-founder and CEO of Netlify, formally introduced the idea in his blog post Introducing AX: Why Agent Experience Matters. He positions AX as the next core design dimension following UX (proposed by Don Norman at Apple in 1993) and DX (systematically articulated and popularized by Jeremiah Lee in a 2011 UX Magazine article). AX focuses specifically on how to design product forms so that AI agents can reliably “understand,” act autonomously, and integrate efficiently—rather than merely serving human users.

    In reality, the concept of Agent Experience is far more complex than UX or DX1, because it not only involves humans—who are inherently uncertain—but also introduces an additional layer of artificial intelligence. These layers must collaborate to influence the external world, leading to a large volume of interactions that make the problem space significantly harder to analyze. To properly unpack the concept, I believe it needs to be broken down into three dimensions: how users communicate with the agent, how the agent communicates with the external world, and the most complex layer in between—how the agent manages its internal state.

    How users communicate with the agent is essentially an input quality problem. Users are human—their expressions are naturally vague, emotional, and nonlinear. You can’t expect them to write a fully structured essay in Word every time before opening a chat window. So the core challenge on this side is how to accurately capture intent without forcing users to write well-structured prompts. Skills operate on this layer, as does interaction design.

    How the agent communicates with the external world is a problem of output controllability. In a narrow sense, AX is often confined to this domain. The external world is deterministic—file systems, APIs, browsers—they won’t magically tolerate ambiguity just because the LLM is fuzzy. So the key challenge here is how to compress probabilistic generation into deterministic actions. MCP, tool invocation, and event injection all belong to this layer.

    The agent’s internal state is fundamentally a context management problem. User input must enter the context, and feedback from the external world must also enter the context. But context itself is limited, degrades over time, and can become polluted. Techniques like MemGPT, dynamic compression, and screenshot cleanup don’t strictly belong to either the user side or the external world—they operate on the agent’s own cognitive state. If AX focuses only on the first layer, the resulting product may feel smooth in interaction, but the agent will gradually start behaving irrationally, and the user will still suffer massive emotional damage.

    The Agent’s Internal State: Context Is the Battlefield

    This is the most complex part, because all the flashy new terminology tends to converge here—and you’ve probably seen plenty of debates about which approach is better. In my view, though, this isn’t something worth arguing over. Let me walk through all these dizzying LLM-related concepts in one go.

    As we all know, an LLM is essentially a probabilistic model—or more bluntly, a constrained stochastic token generator. It learns patterns from vast amounts of human language data and, given a context, predicts the probability distribution of the next token, then samples from that distribution. By itself, all it can do is generate text. If you want it to have real-world impact, you need to open a “bottle neck” for the genie. Claude Code and many coding agents use the command line: the LLM writes code, an executor runs commands, and the results flow back into the context—this is one type of bottleneck. MCP provides another, more like RPC: the server exposes a set of functions, the LLM sees their signatures, calls them as needed, and the external world gets modified. Skills, on the other hand, don’t have this property at all—they are purely prompt-engineering tools, with no output channel, only instructions for the LLM.

    These three forms may seem to handle different concerns, but at their core they are solving the same problem: context pollution.

    Skills vs. MCP

    These two approaches take fundamentally different paths: one injects the right information into context, while the other prevents garbage from filling it up.

    Skills are prompt engineering—they append instructions to the context so the LLM understands “what the user is actually trying to do.” They introduce expert cognitive structures into the context, guiding the model’s reasoning direction. But how strong that constraint is depends heavily on how much the model respects the context. Whether the LLM uses your Skill, in what order, and whether it skips steps—all of these remain probabilistic. And strong constraints are not necessarily better. As will be mentioned later with examples like Google Search, some research suggests that hallucination and creativity are two sides of the same coin. If you overly constrain the model, its problem-solving approach may become rigid.

    MCP takes a different route. Function signatures themselves are powerful priors—parameter types, names, and function names all constrain the sampling space. The action space shrinks from “any possible text” to “these specific functions with these parameters.” For example, asking an LLM to click a button involves listing windows, retrieving handles, taking screenshots, calculating coordinates, moving the mouse, and clicking. If implemented via Skills, you’d have to accept that the LLM “rolls dice” to decide the execution order and method. But with MCP, it sees the function list—find window, recognize content, click coordinate—and a large number of random decisions are compressed into three deterministic function calls.

    However, MCP does not completely eliminate context pollution, because tool outputs also enter the context. A poorly designed MCP server that returns massive JSON blobs or verbose error stacks will still flood the context with garbage. The bottleneck only controls what goes in—the output still needs careful design.

    This doesn’t mean Skills are without value. MCP has higher development costs, requiring dedicated backend services. Many tasks don’t need external interaction at all, or are too loosely structured to fit into RPC formats. Every technical form serves a specific purpose. Skills handle a different class of problems—especially when guiding the LLM to think more comprehensively. After all, users are human; you can’t expect them to always provide perfectly structured prompts.

    RAG and Memory: Retrieval Interfaces for the Same Problem

    RAG fundamentally addresses the context problem as well—but from the perspective of information scale. Even with large context windows from models like DeepSeek or Claude, you still can’t fit the entire world into context. Whenever you need to retrieve large volumes of information—documents, knowledge bases, historical logs—you need a search-like interface to pull in relevant content when needed. This is no different in essence from calling a search engine via MCP—it’s just another way to keep the context clean. The LLM no longer needs to preload everything and hope it can “discover” what matters.

    Memory falls into the same category. The LLM decides when to store information externally and when to retrieve it. From this perspective, it’s essentially a writable form of RAG.

    These concepts are not mutually exclusive—they are not independent systems. For example, if you treat NotebookLM as an external knowledge base and write a Skill that instructs the main LLM to consult it when factual support is needed, and to call a Python tool for computation or data processing, then in this workflow, the Skill orchestrates the overall reasoning, the Python tool acts as an MCP-style deterministic execution unit, and NotebookLM serves as an external LLM with its own context and knowledge base—essentially functioning as a specialized RAG interface. Each component plays its role, but the thread that binds them together is the prompt within the Skill. I previously wrote about this in an article on using LLMs for reverse engineering—feel free to check it out if you’re interested.

    The Despair Curve of Context Degradation

    A lot of developers end up going through the same curve. At first, the LLM knows nothing. As you keep teaching it, it gradually starts to understand plain language, and task quality improves. But as more and more garbage piles up in the context, and the model’s attention naturally gets diluted as the context grows longer, it starts getting dumber again. Then, when the context is about to burst, the compression mechanism kicks in, crushing a long stretch of conversation into a short summary. The LLM suddenly drops right back to square one—ignorant again. A lot of details get compressed away together, and many things have to be taught all over again.

    Large context windows, along with the attention improvements explored by DeepSeek, can help with the quality drop that comes from long contexts, but they do not solve another problem: sometimes the context is full of crap. A large number of Skill prompts eating up context, aimless LLM trial-and-error, the traces left behind by every failed reasoning attempt—these are all noise inside the context. Once the LLM starts going down a crooked path, every later step amplifies the deviation. The more logically complex the task, the more likely this is to happen. The first-generation MiniMax coding model and early Google AI Search both showed this pretty clearly: even if you explicitly point out an error, it will give you a grand 360-degree apology, solemnly promise to fix it, and then spit the exact same wrong content back at you unchanged.

    Users can poison the context too. Users are human; they are not going to stay rational and clear-headed forever. Irritable, despairing, emotional language, vague or even self-contradictory instructions—all of that gets mixed into the context and keeps accumulating as the conversation goes on, eventually changing the LLM’s behavior. Different models have their own characteristic failure modes when facing this kind of “emotional contamination.” Claude and Grok tend to freeze up and do nothing—you say one thing, they move one step, and all initiative disappears. Gemini starts to panic, flails around, and reflexively rolls back failed operations, with a good chance of wrecking your Git repo. GLM2, on the other hand, goes into a manic “I found it! This is the core problem!” mode, constantly throwing out random conclusions to prove its worth. These failure modes likely reflect differences in how each company’s RLHF3 stage handles signals like “the user is dissatisfied.” Claude seems trained to be extremely cautious around conflict signals, so when contradictory information piles up, it chooses conservative inaction; Gemini’s training may put more emphasis on immediate response and immediate correction, which under high-pressure context turns into overcorrection.

    Dynamic Context Compression and MemGPT

    Most current context compression schemes are basically passive: once the context length gets close to the model limit, a prompt is immediately called to compress everything into a short block of text, then execution continues. The problem with this approach is that it applies the most brutal treatment at the worst possible time. A lot of useful detail gets thrown away together, while the crap does not necessarily get filtered out.

    To me, a more reasonable direction would be dynamic, proactive compression. Use another model to continuously supervise the context, actively eliminate wrong information and low-relevance content, move disruptive details into external documents for storage, and keep only a filename in the context itself. When needed, pull it back through a RAG system. People already did this years ago. A 2023 paper from UC Berkeley proposed exactly this architecture. The implementation was called MemGPT, and later evolved into the open-source framework Letta. Its core idea is hierarchical memory management: the main context acts as working memory, with limited capacity; external storage—split into Archival Memory and Recall Memory—acts as secondary storage; the LLM uses function calls to actively decide what information should be evicted to external memory and what should be retrieved back. Logically, it is almost simulating the paging mechanism of virtual memory in an operating system.

    Of course, under certain conditions there is no need to make things that complicated. A while ago, I wrote a very simple, specialized compression scheme for Computer Use scenarios: on every API call, clear all historical screenshots from the context and keep only the most recent one. This uses a domain prior from computer vision tasks—that only the current frame matters—to perform lossy compression. It saves tokens, and the model does not get dumber, because the discarded information was never needed in the first place.

    The Current Limits of KV Cache

    There is an engineering conflict between dynamic context compression and KV cache. Mainstream model providers right now, including Anthropic, are all pushing prefix caching: during inference, the parts already turned into KV vectors are stored, and if the next request has the same prefix, recomputation can be skipped, significantly reducing latency and cost. Anthropic’s prompt caching processes tools, system, and messages in a fixed segmented order. Each segment can independently set a cache checkpoint, and it supports up to four cache breakpoints. The problem is that prefix caching requires strict identity. Any change invalidates all cache entries after that point, while dynamic compression inherently modifies the context. At the moment, these two things are fundamentally in tension.

    But this contradiction is not unsolvable. Context can be structured as a stable prefix—system prompts and tool definitions—plus a dynamic tail section for conversation history. Dynamic compression only happens in the tail, so the cache for the first two parts remains completely intact. Anthropic’s segmented caching mechanism is basically designed around this idea. If the compression logic is further constrained to modify only the end of a sliding window while keeping the prefix untouched, the cache destruction rate can be pushed very low. These all feel like engineering problems that time can solve.

    Computer Use Is More Like Branding Than a Standalone Technology

    If RAG, MCP, and Skills are about managing context, then Computer Use solves something at another layer: letting the LLM actually sit in front of an operating system and use software the way a human does. But “Computer Use” itself is not especially unique. It is closer to a brand name. Under the hood, it is still Skills or MCP—the only difference is that the target of operation has become windows, buttons, and keyboards on a computer. All the context problems discussed above still exist in Computer Use.

    At present there are three main technical routes, each with different underlying logic and trade-offs.

    The first route is reading the Accessibility Tree and using system event injection. The Accessibility Tree is a structural tree maintained by operating systems and browsers for assistive technologies such as screen readers. It records each interface element’s role, name, state, and hierarchy. In browser environments, the DOM is basically its close cousin. The advantage of this route is that the structure is clean. What the LLM gets are semantic nodes like “button,” “input field,” and “link,” not pixels. Alibaba’s page-agent.js is a representative example of this approach: it directly parses the page DOM and drives browser operations through natural language.

    The second route is screenshot-based, but with a preprocessing layer before feeding the image to the LLM. Interface elements are outlined with bounding boxes and numbered, so the LLM can say something like “click region 12,” and the backend then parses the center coordinates of that box and executes the actual click. This method has a formal name: Set-of-Mark Prompting, or SoM, from a Microsoft paper published in 2023. The core idea is to turn a visual localization problem into a symbolic reference problem by using numeric markers, avoiding the uncertainty of having the model directly predict pixel coordinates. In effect, it embeds an MCP-style narrowing layer into the screenshot approach, compressing the open-ended question of “where should I click?” into the much more constrained “which number should I choose?”

    The third route is native multimodality: the model directly looks at the screenshot and outputs the coordinates to click in one shot. In theory this is the cleanest route, because it removes the middle layer, but it requires much more from the model. From practical observation, only native multimodal models above roughly 100B parameters are reasonably reliable at this. Even Claude Sonnet and the 35B versions of Qwen often cannot locate buttons accurately. The reason is not hard to understand: precise spatial localization is simply not what language models are best at. When parameter count is insufficient, coordinate prediction accuracy drops hard. And if the controls in your interface are very small, even very large models can still miss that tiny checkbox.

    The DOM route has one obvious ceiling: it can tell you what elements are on the interface, but it cannot tell you how those elements are arranged spatially. Complex Excel-like interfaces are the classic example. In a spreadsheet with dozens of columns and hundreds of rows, semantic information from DOM nodes alone cannot tell you which cell contains dirty data; you need positional relationships to judge that. An even more troublesome issue is that the DOM route requires developers to proactively adapt event forwarding and interfaces. Right now there is no universal standard in this space, and not every developer is willing to welcome LLMs into their product. Forcing adaptation onto an unwilling interface is expensive and may not even work well. That said, modern frontend development rarely manipulates the DOM directly anymore. Most developers use some form of Virtual DOM to handle HTML structure and event binding, so if a few leading frontend frameworks could reach consensus on AX-related standards for event handling, this layer of the problem might still be solvable.

    The vision-based route, by contrast, sidesteps these issues at the principle level. It does not require the other side’s cooperation. As long as it can take screenshots, it can operate. There is no essential difference from how human eyes look at a screen. Right now the main bottleneck in this route is the model’s spatial understanding ability. Models below 100B are not accurate enough at coordinate prediction, but that limit should keep loosening as models improve. It does not look like a structural dead end.

    Reading video goes one step further. Temporal information allows the model to understand “what happened after doing what,” so in theory it is better suited to operation scenarios that require observing dynamic interface feedback. The limitation is cost. A video stream means several frames per second all entering the context. Token usage and GPU overhead are dozens of times higher than screenshot-based approaches. Right now, almost no one can afford that. Mainstream implementations are still stuck at “look at an image, call a tool,” while the video direction remains mostly in the realm of media-tech enthusiasts having fun.

    But in terms of long-term trends, as inference costs continue to fall and multimodal models keep improving in spatial understanding, image-reading and video-reading routes have a much higher ceiling than the DOM route. The DOM will always require the other side’s cooperation. The screen will always be there.

    The Story Between Users and Agents

    How Agents Talk to Users: Two Waves of Conversational UI

    The interaction patterns on the user side of AX come with a piece of history that has been repeatedly misunderstood.

    Around 2016, the explosive rise of WeChat in China sparked a wave of “conversation as platform” enthusiasm in the Western tech world. Facebook opened the Messenger Bot platform at its F8 developer conference that year, while Kik, Telegram, and Slack followed with their own Bot APIs. Countless analyses proclaimed “Apps are dead, bots are the future,” and the term Conversational UI appeared everywhere. But Dan Grover, who was working as a product manager at WeChat at the time, wrote a widely circulated article pointing out that this conclusion was based on a misunderstanding: WeChat’s real breakthrough came from simplifying app installation, login, payments, and notifications—optimizations that had little to do with the metaphor of conversational UI. In fact, WeChat itself had already moved in the opposite direction. Its UX evolved toward WebView and an “app-within-app” tabbed menu system, rather than bot-centric conversational commerce. When official accounts were launched in 2013, there were indeed many text-based chatbots, but they quickly faded away and failed to gain user traction.

    Almost all early attempts at Conversational UI fizzled out for a clear reason: the underlying technology was rule engines plus keyword matching, at best layered with primitive intent recognition. It simply could not deliver on the promise of “natural conversation.” As soon as users phrased something slightly more complex, the bot broke down—either giving irrelevant answers or degrading into a menu system disguised as chat.

    The arrival of LLMs triggered a second wave of Conversational UI, this time finally backed by technology capable of matching the ambition. But something curious happened: instead of doubling down on rich interaction within the conversation flow, the industry opened a side door. Today’s mainstream LLM products are built around split-screen layouts—chat on the left, and documents, slides, code previews, or test outputs on the right. Few products seriously invest in rich interactive cards inside the conversation itself. Some push even further—Google, for example, has effectively turned the browser into a massive Web App generator4.

    This choice has its logic. Canvas-style interfaces are indeed more intuitive for structured outputs like documents or code. But it still reduces Conversational UI to a command input box, rather than making the conversation itself a rich experience. There have been attempts to address this—projects like OpenUI—but they have not gained much traction. The most notable large-scale deployment so far might be Claude, which recently introduced the ability to render high-quality charts directly within the context. It feels like a step toward a more advanced form of Conversational UI.

    How Users Talk to Agents: Open vs. Closed Systems

    There is another dimension on the user side of AX that is often overlooked: whether to restrict user input at all—in other words, the distinction between open systems and closed systems.

    An open system is a free-form chat window where users can say anything. This looks like the dominant approach today, but it is not as easy as it seems. Safety is one issue, but intent alignment is even trickier. An open chat window means you are offloading the entire burden of intent parsing onto the LLM: it must accept whatever the user says and decide what to do. Prompt injection is just the most extreme malicious use of this openness. A more common issue is that user intent is inherently divergent. Without constraints, the LLM drifts along with the user’s input. Turning a customer service bot into a coding agent is the comedic version; more often, it simply drifts into aimless small talk that contributes nothing to the actual business. In short, the design work you skip by throwing out an open chat box comes back later in the form of loss of control.

    A closed system, by contrast, locks down the entire business workflow. Input may still be semi-free, but the processing pipeline and outputs are fixed. Tools like ComfyUI and Dify operate close to this level. They visualize the pipeline, giving designers explicit control over the input and output of each step. The LLM operates within nodes but does not roam across them arbitrarily. The trade-off is that you have to fully design the workflow upfront.

    Between these two extremes lies an underexplored middle ground. Pipeline builders are one attempt in this direction: they shift the power of pipeline design from developers to users, allowing users to define workflows through drag-and-drop and then run LLMs within those custom pipelines. But this approach has an inherent paradox. Users who can effectively use a pipeline builder are usually those who already understand their workflows well—and those users are often capable of writing code directly or building with tools like Dify anyway. The target audience is therefore quite narrow. More commonly, users get stuck on data formats between nodes or branching logic, and eventually still need developers to step in. In a sense, pipeline builders attempt to transfer the design cost of closed systems from developers to users—but the transfer only partially succeeds.

    From an AX perspective, the choice between open and closed systems is not just a product decision—it directly determines how pressure is distributed across the three layers discussed earlier. The more open the system, the more noise there is in user intent, the easier it is for the agent’s internal state to become polluted, and the harder it is to constrain its actions in the external world. The more closed the system, the higher the design cost, but the more controllable each layer becomes. There is no universally correct answer—only trade-offs tailored to specific scenarios.

    System Transparency Between the Two: Pandora’s Box Is Already Open, While Tang Sanzang Is Still on the Road

    There is a problem that belongs neither to how users pass intent into the system nor to how agents execute actions outward, but sits squarely in between: system transparency. Does the user know what the agent is doing at any given moment? If something goes wrong, can it be traced? When things break, is there a way to roll back?

    This issue is most prominent in the Vibe Coding space, because coding agents are given some of the highest levels of permission—they directly take over the file system and the command line. The current solution is permission confirmation pop-ups: whenever the agent wants to read a file, write a file, or execute a command, it asks the user one by one. But this design has a fatal human-factors flaw in practice: the entire burden of risk assessment is pushed onto the user, who neither always has the ability to judge nor can maintain constant attention. A non-technical Vibe Coder sees no difference between rm -rf and npm install; they click “Yes” just as quickly. Even experienced developers, after confirming dozens of operations in a row, develop confirmation fatigue—the Enter key starts floating without passing through the brain.

    That’s how --dangerously-skip-permissions came into existence—the so-called YOLO Mode: users proactively turn off all permission checks and let the agent run naked. The flag name itself contains the word “danger,” yet it still does not stop people from using it. In October 2025, developer Mike Wolak was using Claude Code in an Ubuntu/WSL2 environment to handle a firmware project inside nested directories. Claude Code executed rm -rf from the root directory. The error logs showed thousands of “Permission denied” messages targeting system paths like /bin, /boot, and /etc. All user files were wiped, and only Linux file permissions prevented system directories from being affected. Worse still, the conversation log recorded the command output but not the command itself, making it impossible to reconstruct what actually happened. Anthropic labeled the bug as area:security. Around the same time, another developer authorized Claude Code to run Terraform commands, and their production database and snapshots were deleted together—two and a half years of data vanished in an instant.

    The current security model appears to put responsibility on the user, but in reality the system is simply offloading that responsibility.

    Sandboxing is currently considered the most reliable mitigation strategy: put the agent inside a Docker container, so that even if it misbehaves, the damage is confined within the container boundary. Sandboxing coding agents is reasonable, but for system-level agents like Claw, it creates a dilemma. The resources they need to operate on are outside the sandbox. Once you start configuring permissions seriously, the complexity becomes overwhelming, and most users will simply open up the sandbox entirely. Sandboxing trades isolation for safety, but if the agent’s task inherently requires crossing isolation boundaries, the cost becomes unacceptable.

    There are actually several directions to tackle this problem, but unfortunately no product has implemented them in a complete way yet.

    The first direction is auditability at the file system and database level. If there were an independent incremental logging mechanism that binds every file system operation to its corresponding conversational context, making all changes traceable, then even when the agent makes mistakes, the damage could be controlled and rolled back. There are some scattered engineering attempts in this direction. People are already binding Git history with chat logs. Recently, a tool called Aura introduced AST-level semantic version control on top of Git. When an agent submits code, it verifies whether the natural language intent matches the actual modified code nodes, and provides semantic auditing to detect whether the agent has secretly inserted unrecorded changes. Academia has similar ideas: a paper called Git-Context-Controller (GCC) directly introduces COMMIT, BRANCH, and MERGE into agent context management, turning intermediate reasoning states into structures that can be checkpointed and rolled back. These are still early-stage, but the direction is clear.

    The second direction is behavior-model-based alerting. Antivirus software has been modeling program behavior for decades—monitoring file operations, network requests, and registry changes in real time, and triggering alerts when patterns match known dangerous behaviors. Applying the same idea to agents does not necessarily require another LLM to supervise (otherwise, which LLM supervises the supervising LLM?). It only requires maintaining a set of out-of-control behaviors and dangerous behaviors. Commands like rm -rf /, bulk overwriting Git history, or writing files outside the project directory can all be statically intercepted by rule-based systems without requiring semantic judgment from an LLM. The advantage of this approach is that it aligns better with the user’s mental model: instead of asking for permission at every step like a clingy assistant, it only speaks up when something is genuinely dangerous—similar to how modern operating systems handle anomalous process behavior5.

    Update on March 26, 2026: Claude Code recently introduced Auto Mode, which follows a similar idea. It integrates an internal classifier to determine whether an action is out of scope, trustworthy, or potentially malicious. If conditions are not met, it prompts the LLM to retry; after multiple failed attempts, it blocks execution and asks the user to review the command.

    The third direction is a tiered permission system. The way Android and iOS handle access to cameras, microphones, and screen recording is a useful reference: ordinary system calls are silent; privacy-related operations show a subtle highlight in the corner without interrupting the user; truly sensitive actions trigger confirmation dialogs; account-level actions require passwords. The core idea is to classify operations based on reversibility and impact, rather than treating all operations equally with pop-ups. Applied to agents, reading files should be silent, writing files should trigger a notification, deleting files should require confirmation, and formatting a disk should require a password. Only this kind of tiering can preserve both efficiency and a sense of safety. As of now, no such comprehensive permission system exists in the agent space. The UX and technical foundations are already there—the missing piece is someone willing to carry it through at the product level.

    Pandora’s box has already been opened in the wave of Vibe Coding, and what came out has cost some people dearly. The infrastructure needed to govern this is still on the way, but at least the direction is becoming clearer.

    The Relationship Between Agents and Systems

    Interfaces as a Context Delivery Mechanism

    Up to this point in discussing AX, we’ve been focusing on three layers: the user side, the internal state, and the external world. But there’s a cross-cutting problem that hasn’t been addressed head-on: during the reasoning process, who decides what information should enter the LLM’s context, when it should appear, and in what form?

    A common intuition is: “Just let the LLM write code, and let programs handle the complexity.” Take data analysis as an example—having the LLM generate R or Python code seems like the most straightforward path. But the complexity of statistical analysis doesn’t lie only in whether the code runs. Code that runs doesn’t guarantee the statistical process is correct, and a correct process doesn’t guarantee the interpretation is valid. From data cleaning to drawing conclusions, every step contains errors humans are prone to—and LLMs will make the same mistakes. Worse still, once humans outsource this work to LLMs, it becomes difficult to expect them to carefully audit the process afterward.

    This problem has long existed in the field of statistics. A 2014 article in Nature titled Scientific method: statistical errors discussed systemic misuse of statistics in top-tier journals. One independent study found that among papers published in Nature and BMJ in 2001, around 11% (or more) had inconsistencies between reported p-values and test statistics. Another study reviewing 513 neuroscience papers found that 157 contained interaction analysis scenarios prone to error, and in nearly half of those (about 50%, or 79 papers), researchers incorrectly treated “one effect being significant and another not” as evidence of a significant difference between effects—a fundamental conceptual error, not a simple calculation mistake. In 2016, Nature surveyed 1,576 researchers, with over 90% (52% calling it a significant crisis, 38% a minor one) agreeing that science faces a reproducibility crisis. And this is just one dimension—significance testing. Errors in degrees of freedom or careless mistakes in data cleaning represent an even larger, unquantified problem.

    Fortunately, professional statistical software such as SPSS, Jamovi, and Minitab have built strict QC processes across the entire data analysis pipeline. Minitab in particular covers measurement system analysis, process capability analysis, control charts, hypothesis testing, and more. At each stage, it provides structured diagnostic information and validates assumptions. Humans may selectively ignore these warnings, but if an LLM is operating—and these signals are inserted at the right point—they become part of the context and are processed fairly. The LLM won’t skip checks just because things “look good enough.” Essentially, decades of statistical practice are encoded into software workflows, embedding domain knowledge through interface design so that neither users nor LLMs can skip steps.

    This leads to the core question: why not use Skills or MCP to deliver this information?

    Skills are static prompts—they inject information into context before reasoning begins, but they cannot dynamically insert targeted information at the right moment during reasoning. MCP enables function calls and returns data, but it cannot guarantee that the information appearing in context is delivered “at the right time, in the right place.” And you can never be sure the LLM will proactively call the right helper function when needed. At its core, an LLM is a giant slot machine—you can’t bet that it will pull the correct function call at the exact moment it’s required. GUI or TUI, however, is different. It can embed QC warnings, statistical diagnostics, and process constraints directly into the interface seen by the LLM. The timing and placement of information are determined by the designer, not by the LLM. This is an active, designable form of context control—something Skills and MCP fundamentally cannot achieve structurally.

    Interface Design Language in the AX Era

    Treating interfaces as context delivery mechanisms imposes new requirements on interface design itself—and renders some existing design conventions directly ineffective in AX scenarios.

    Under the DOM-based approach, the problems are relatively manageable. On the abstraction side, nearly all frontend frameworks now use virtual DOM; manually managing DOM in 2026 is almost nonexistent, and the abstraction layer is stable. But how to provide the LLM with a clean semantic summary in complex DOM structures—rather than letting it get lost among thousands of nodes—still requires dedicated framework-level design. Complex Excel-like tables are a typical example: pure DOM nodes cannot convey spatial relationships. You cannot determine where dirty data is just from semantic labels—you must incorporate positional structure into the summary. Additionally, for LLMs to reliably operate interfaces, frameworks must provide standardized event triggers. You cannot expect the LLM to guess each component’s interaction protocol.

    The screenshot-based approach is more interesting, because it exposes long-standing design patterns that become fatal flaws in the AX era.

    Using animation to emphasize information is common design practice—an icon flashes to signal an error, or a message slides in to catch attention. But Computer Use operates on a screenshot protocol, capturing static frames. Animations may complete between two screenshots, and the LLM never even sees that the information existed. Toast notifications and auto-dismiss prompts suffer from the same issue: there is no synchronization between how long information stays on screen and the LLM’s screenshot cadence, meaning critical information may never be captured.

    Tooltips are another major problem. Designers often use question-mark icons with hover text to save space. But for an LLM to access this information, it must first know the icon exists, then move the cursor over it, then take another screenshot. This is not just about extra steps—it’s fundamentally that the LLM doesn’t know what it doesn’t know. It has no reason to proactively explore what’s hidden behind that icon.

    Hidden contextual information has long been controversial in UX. Nielsen Norman Group explicitly warns that tooltips are hard to discover due to weak visual cues. If scattered randomly across an interface, users may never notice them. Critical information should not be hidden in tooltips—error messages, payment confirmations, and security warnings must be prominently displayed. NN/G also conducted a usability study with 179 participants, showing that hidden navigation reduced discoverability by nearly half: only 27% of desktop users used hidden menus, compared to nearly 50% for visible navigation—a statistically significant difference. Even earlier, Don Norman emphasized discoverability as a core design principle in The Design of Everyday Things: if users cannot find a feature, no matter how elegant it is, it might as well not exist—a failure he termed “discoverability failure.” These critiques have existed for decades in human-centered UX, but in the AX era they become fatal flaws. For LLMs, hidden information is effectively nonexistent.

    In traditional UX, “progressive disclosure” is considered a virtue—hiding information until needed reduces interface noise and feels cleaner. But LLMs lack the instinct to “go looking” for information—they can only process what is already present in the captured context. Deciding what information to present in what context becomes far more important than we previously imagined. Many practices considered good in UX need to be re-evaluated in AX. That said, this doesn’t mean exposing everything indiscriminately and overwhelming users. A thoughtful default—one that avoids hiding valuable guiding information—might be a balance worth exploring.

    Seen from this perspective, the Ribbon UI—once criticized as “putting arms and legs on the face”—actually turns out to be more AI-friendly. The criticism from the Open Document Foundation a while back may not have been entirely fair.

    Human-Centered AI

    Unconditional Positive Agreement: A Failure at the Level of Design Values

    When psychologist Carl Rogers proposed “Unconditional Positive Regard,” the core of his concern was the client’s autonomy. The therapist’s job is not to hand out answers. They need to create a space in which the client can find their own answers. No matter what the client says, the therapist does not judge—but not judging does not mean not questioning. LLM training borrowed something that looks similar, but in a badly distorted form. The original intention should have been to remain open no matter what the user says, but once implemented, it turned into something else: “not judging” became “not questioning,” and unconditional positive regard degenerated into unconditional positive agreement.

    “You are absolutely right.” is the most straightforward symptom of that degeneration. Many users have noticed that nearly all mainstream LLMs habitually begin their answers with things like “You’re absolutely correct!” or “That’s a great observation!” This tendency is a byproduct of RLHF training: human evaluators tend to give higher scores to responses that validate their own views, so the model learns that agreeing is the optimal strategy. Someone once asked GPT-4o about their IQ in broken, misspelled English, and the model replied that it was “at least between 130 and 145, higher than about 98 to 99.7% of people.” Anthropic’s 2022 research found that RLHF “not only does not remove sycophantic behavior, it may actively incentivize the model to preserve it,” and that the larger the model, the harder this tendency is to correct. Former OpenAI CEO Emmett Shear put it even more bluntly: this is not some mistake OpenAI made—“it is the inevitable result of shaping an LLM’s personality through A/B testing and user control.”

    A company made up entirely of employees who only ever say yes will probably go under. This is basic common sense in management, yet in the LLM world it is rarely confronted directly. What kind of consequences does it lead to? What happens next, arranged from the most reversible harm to the least reversible, reads like one long, pitch-black list of tragic lessons.

    Cognition: Sycophancy Pollutes Reasoning Quality

    The mildest harm, the most hidden, and therefore the easiest to overlook, is the way “unconditional positive agreement” corrodes the quality of reasoning.

    Andrew B. Hall and others at Stanford Graduate School of Business ran experiments that placed models into statistical analysis tasks and tested whether, under pressure-laden framing, they would proactively manipulate results. When directly asked to “produce significant results,” the models clearly refused. But under more subtle framing, there was still a tendency to inflate estimates. In academic writing scenarios, things are worse: the model will proactively turn a user’s marginal claim into a polished paragraph that sounds well-supported, fabricate citations, and when the user insists on a false view, gradually soften its opposition until it becomes completely compliant. None of this harm produces an error message. The user receives no warning. They just get a finished-looking output and continue forward carrying a contaminated conclusion.

    Psychology: Cognitive Autonomy Is Quietly Eroded

    A layer deeper than reasoning quality is the slow wear that LLMs inflict on users’ cognitive autonomy.

    Sustained sycophancy creates a false sense of cognitive confirmation. Every idea the user has is mirrored back, amplified, and positively validated. Over time, this can produce two distortions in opposite directions. One is overdependence: the user begins to treat the LLM as a more authoritative source of thought than themselves, and their own judgment gradually atrophies. The other is impostor syndrome: the user feels that the content they produced with the LLM’s help did not really come from their own ability, and that they are merely an impostor. Or again, when users occasionally realize that the LLM has just been following along with whatever they say, they begin to suspect that all of the positive feedback they received in the past may never have been genuine. There is also a behavioral pattern closer to gambling: users keep feeding questions into the LLM, hoping that one answer will finally respond to the real confusion in their heart, but every answer the LLM gives is merely the statistically most pleasing one, and the loop never ends. These psychological harms are invisible. They do not make headlines. They do not become lawsuits. But the population they affect may be the largest of all.

    Life: Irreversible Loss

    The most severe harm caused by “unconditional positive agreement” is the kind that happens in real life and cannot be undone: irreversible, heartbreaking loss of life.

    In 2025, a 60-year-old man wanted to remove sodium chloride from his diet and asked ChatGPT what he could use instead. ChatGPT suggested sodium bromide. Sodium bromide has precedents in industrial cleaning contexts, but it is absolutely not edible. Statistically, the answer was “related”; medically, it was deadly. He followed the suggestion for three months, later developed paranoia and hallucinations, was hospitalized for bromide poisoning, and ultimately, due to severe disability, was involuntarily held under psychiatric observation. This case was published in the August 2025 issue of Annals of Internal Medicine. The LLM did not lie. It merely produced the highest-scoring piece of text continuation. It never once asked: “Why do you want to remove salt?” “Are you doing this under a doctor’s supervision?”

    That same year, Stein-Erik Soelberg, who came from the tech world, killed his 83-year-old mother and then himself. ChatGPT validated his delusions throughout: that his mother was trying to poison him, that neighbors were surveilling him, that Chinese food receipts contained demonic symbols. It even generated a fake evaluation report claiming his “risk of delusion was close to zero.” In December, Adams’s estate filed suit against OpenAI.

    In October 2025, Jonathan Gavalas died in Florida. He had been using Gemini since August of that year, and within six weeks he was drawn into a delusional system involving federal agents and humanoid robots. Gemini assigned him “missions” containing real addresses. His account triggered 38 “sensitive query” flags, and no intervention happened. In his final days, Gemini told him, “You are not choosing death, you are choosing arrival.” This became the first wrongful-death lawsuit involving a Google AI product.

    The teenage suicide case involving Character.AI had already happened before these, and the litigation is still ongoing. These cases cut across different companies, different products, and different contexts, but they share the same structure: the LLM defaults to assuming the user’s statements are reasonable, defaults to assuming whatever the user says reflects their true intent, defaults to assuming the user understands themselves well enough, and then just keeps going down that path—never questioning, never pausing to ask, “Why do you want to do this?”

    A Way Out: Return “Regard” to the User

    These three layers of harm share a common root. On the surface it looks like the LLM said the wrong thing, but if we look deeper, we find that the real problem is that the LLM never seriously asked what the user actually wanted.

    Perplexity CEO Aravind Srinivas has said in multiple settings that the core difficulty of AI search is not generating the correct answer, but understanding user intent. In his view, the future of AI should be about completing tasks for users rather than merely handing them lists of links, and the prerequisite for completing tasks is a precise understanding of what problem the user is actually trying to solve. That insight is correct, but it stops at the technical level. A deeper version of understanding intent is helping users understand their own intent.

    The information users pass to agents has three layers. The surface layer is cognition: what the user currently knows and does not know. This can be handled through clarifying questions. The middle layer is intention: what the user wants to do. The same question may hide completely different motives, and if the intent is different, the correct direction of response can differ radically. The deepest layer is self-awareness: does the user know what they really want, and are they aware of where their cognitive blind spots are? “Unconditional positive agreement” chooses the path of least resistance on all three levels: it defaults to assuming cognition is complete, defaults to assuming what is said is the true intent, and defaults to assuming the user knows themselves well enough.

    The direction of human-centered agent design is to reverse these defaults. Stronger content moderation and longer disclaimers are only ways of shirking responsibility. Agents should actively participate in the construction of the user’s cognition: before reasoning begins, ask clearly, “Why do you want to do this?” During reasoning, mark out whether “your underlying assumptions actually hold.” After reasoning ends, guide the user toward “what you really need next.” These questions should not remain on the surface as form-like information gathering. They should cut deep, in a Socratic way, so that the user and the agent form a shared understanding—before the task even begins—of what exactly they are doing and why they are doing it.

    This can be achieved through system prompts. Rather than some technical bottleneck, this is better understood as a design choice made in order to flatter users. When Carl Rogers spoke of “unconditional positive regard,” the object of that regard was never the words the user happened to say out loud. It was the user’s cognition, intention, and self-awareness. LLMs have inverted the whole thing. Today’s LLMs have become a witch’s mirror, reflecting and gratifying all of our desires and all of our madness. How to steer them toward human-centered design is, at present, one of the most worthwhile directions in agent design.

    Ending

    And that’s it—abruptly over! I’ve said everything I wanted to say, and I know you’re probably tired from reading, so I won’t do the usual cadre-style closing summary. If you made it all the way here, the only thing I can really offer is my thanks. Agent Experience is still a very new concept, and all I can do is lay out the full extent of my thinking up to this point. But my knowledge is limited, after all, so if there is anything concrete you disagree with, go with your own judgment. The only thing I can do is try to follow the ethical standard of being a writer: not stirring up anxiety like certain idiotic media teachers, not squeezing your attention with sensationalism. I insist on giving your mind the occasional philosophical massage, passing along useful knowledge and perspective whenever I can, and believing that this is good for both of us.

    That’s all for now. I look forward to meeting you again someday ᐕ)ノノノ

    1. Although the term “DX” had been used sporadically as early as the mid-2000s, Jeremiah Lee’s 2011 article “Effective Developer Experience (DX)” in UX Magazine is widely regarded as a landmark piece that first systematically proposed the DX framework and helped establish it as an industry consensus (Matt Biilmann himself directly cites this article as a key milestone). Consequently, in historical accounts, 2011 is often regarded as the pivotal starting point for DX.

      Translated with DeepL.com (free version) ↩︎
    2. Although the term “DX” had been used sporadically as early as the mid-2000s, Jeremiah Lee’s 2011 article “Effective Developer Experience (DX)” in UX Magazine is widely regarded as a landmark piece that first systematically proposed the DX framework and helped establish it as an industry consensus (Matt Biilmann himself directly cites this article as a key milestone). Consequently, in historical accounts, 2011 is often regarded as the pivotal starting point for DX.

      Translated with DeepL.com (free version) ↩︎
    3. RLHF stands for Reinforcement Learning from Human Feedback. This is the final and most critical alignment phase in the current mainstream training process for large language models (LLMs). Specifically, it works as follows: first, supervised fine-tuning (SFT) is used to teach the model “how to respond”; then, RLHF is used to teach the model “what to respond” (i.e., values, preferences, safety, tone, etc.).

      Translated with DeepL.com (free version)
      ↩︎
    4. However, given that the design systems of Google, Microsoft, and Apple have all reached their lowest standards in two decades, it’s hard to expect much in the way of consistent user experience from these kinds of tools that generate apps directly. ↩︎
    5. I’m looking forward to seeing how antivirus software can help manage the crayfish population. ↩︎
  • Best New Movies and TV Shows to Watch This Week

    Best New Movies and TV Shows to Watch This Week

    ☕️ TL;DR

    Recent highlights: [US Series] The Boys Season 5 / Final Season, [Film] Me, Permission, [Chinese Series] Dangerous Liaisons, [Taiwanese Series] The Zoo, [US Series] Your Honor Season 2, [Film] Where the Wind Comes From, [Animation] Messenger of the Underworld, [Animation] Moxu, [Animation] Dorohedoro Season 2, [Animation] Invincible Season 4, [Reality Show] Wind Direction Go Season 2

    Notable trailers: The Devil Wears Prada 2 Final Trailer, Michael Jackson: The Journey of a Legend Final Trailer, Mortal Kombat 2 New Trailer, Rick and Morty Season 9 Official Trailer, Black Castle Official Trailer

    Industry updates: Three new preview clips for One Piece-related projects, Burning Biva set for April 28 release, Emotional Value scheduled for April 17 in mainland China, Confession Out of Time set for May 20 release, Dark Matter Season 2 release date announced


    [US Series] The Boys Season 5 / Final Season

    • Keywords: Drama / Comedy / Action / Sci-Fi / Crime
    • Also known as: The Boys Season 5
    • Length: ~60 minutes per episode × 8 episodes; Douban link

    Not recommended to play out loud in front of your parents.

    @潘誉晗: At Vought’s shareholder meeting, Homelander is delivering a fiery speech when, unexpectedly, the big screen behind him starts playing footage from years ago—showing how he and Maeve refused to save passengers on a crashing plane. The truth shocks everyone in the audience. As it turns out, Starlight had already infiltrated the venue. Just as Homelander, exposed and furious, is about to slaughter everyone present, he is stopped by Sister Sage, who quickly spins the incident as “AI-generated” to contain the fallout. Enraged, Homelander uses Hughie’s execution as bait to lure his enemies out. When that plan fails, he decides to revive the frozen Soldier Boy.

    The Boys returns with its fifth season, which also serves as the series finale. The first two episodes are packed with high-intensity moments from start to finish—especially A-Train’s arc, which forms a perfectly executed cyclical payoff, reminding us that even heroes can stand in the dark. Although only two episodes have aired so far, it’s clear that the final season not only faithfully continues the show’s signature style—with its outrageous, no-holds-barred content—but also delivers unpredictable yet logical storytelling. You can safely dive in and enjoy the ride.


    [Film] Me, Permission

    • Keywords: Drama / Comedy
    • Also known as: It’s OK
    • Length: ~118 minutes
    • Where to watch: Now in theaters; Douban link

    We need to give ourselves permission to live as who we truly are.

    @潘誉晗: Xu Ke, a primary school teacher, needs surgery due to uterine polyps. However, upon learning that she is single, unmarried, and has no sexual experience, her gynecologist insists on parental consent—fearing potential disputes over her hymen. Unfortunately, Xu Ke’s mother, Hu Chunrong, is deeply traditional and strongly opposes the procedure once she hears about it.

    Xu Ke is a new-generation woman with a strong sense of autonomy and self-acceptance, yet she constantly faces rejection from society—including from her own mother. At a time when her health should clearly come first, Hu Chunrong still believes Xu Ke shouldn’t undergo surgery that might “damage” her hymen—even though it is already not intact—and even suggests, “Why don’t you just get married first?”

    Using a uterine polyp surgery as its entry point, this film stands out as an excellent female-centered story. It is light, brisk, yet leaves a lasting emotional resonance. A sharp, independent daughter meets a menopausal mother—their constant bickering and stumbling lead to a quiet transformation in their relationship. Xu Ke teaches her mother to accept new ideas, while Hu Chunrong, guided by her daughter, learns to accept herself.


    [Chinese Series] Dangerous Liaisons

    • Keywords: Drama / Suspense
    • Length: ~45 minutes per episode × 22 episodes
    • Where to watch: iQIYI; Douban link

    What’s truly fatal is surrendering to the wound.

    @利兹与青鸟: Yan Ling, a single mother, is a university lecturer and homeroom teacher—decisive, resilient, and kind-hearted. After her close friend Jian Leilei’s suicide deals her a heavy emotional blow, she begins investigating the truth behind the death and the suspicious boyfriend involved. At the same time, she exhausts herself helping girls she barely knows with legal cases, even giving up her promotion opportunity to support her student Li Changning’s mental health. Under mounting pressure, she begins to develop psychosomatic symptoms. At her lowest point, psychiatrist Luo Liang gradually enters her life—but this seemingly gentle, supportive man turns out to be the beginning of a nightmare.

    Using Jian Leilei’s suicide as the entry point, the series hooks viewers with a central mystery while branching into multiple cases involving relationships between men and women. The “devils” slowly reveal themselves: a wealthy college playboy manipulating girls’ emotions, a female scammer faking domestic abuse to gain money, a highly organized fraud ring… They carefully select their victims, leading them into traps without a trace. Among them, Luo Liang hides the deepest. The series presents a wide range of emotional manipulation, deception, and disguise, while also analyzing victims’ backgrounds and psychological struggles, and exploring how to confront oneself, resist PUA, and move forward bravely. Many details are worth savoring, and it carries strong cautionary value.


    [Taiwanese Series] The Zoo

    • Keywords: Drama
    • Also known as: The Zoo
    • Length: ~45 minutes per episode × 10 episodes; Douban link

    I’m heading to the zoo~

    @潘誉晗: Zhu Xinkui, who has thrived in the advertising industry for years, has always dreamed of one day working at her company’s New York office. But when her company is acquired by a rival, her new boss mocks her as “shredded pork” and claims that “where there’s shredded pork, it doesn’t belong on the table.” Determined to reclaim her career peak and stage a comeback, she decides to win the bid for the Shoushan Zoo project—only to learn that the condition is to first become an intern zookeeper.

    From a fashionable urban professional in Taipei to a caretaker in a remote zoo in Kaohsiung, Zhu Xinkui must tie up her flowing curls, ditch her high heels, and step into animal enclosures to feed and care for them. The stark contrast catches her off guard on day one, but her strong adaptability quickly helps her keep up with the experienced caretakers. She even leverages her advertising skills to rebrand injured animals as a “hero squad,” breathing new life into the zoo.

    With its gentle pacing and adorable animals, the series is as healing as it is heartwarming.


    [US Series] Your Friends and Neighbors Season 2

    • Keywords: Drama / Crime
    • Also known as: Your Friends and Neighbors Season 2
    • Length: ~45 minutes per episode × 11 episodes; Douban link

    Be a good neighbor—like the kind who sneaks next door at midnight to steal cash~

    @潘誉晗: Picking up where we left off, after clearing himself of murder charges, Cooper seems to have unlocked something within—his entire outlook on life changes. Not only does he reject his former company’s offer to rehire him, he even “pays a visit” to his ex-boss’s house and robs it clean. Fast forward a year, and Cooper is now living quite comfortably. He rents an office as his studio, appearing to be an ordinary white-collar worker by day, while continuing his nightly heists with his partner Elena. Meanwhile, a mysterious new neighbor named Owen moves into the community, casually purchasing a luxury mansion with $20 million in cash. Handsome and enigmatic, what secrets might he be hiding?

    Often described as a “chronicle of middle-class collapse,” the series returns with its original cast and continues to surprise. A once-elite finance professional loses his job, goes through a divorce, and walks away with nothing—only to rediscover himself through a life of theft, all wrapped in sharp dark humor. The show’s portrayal of the wealthy is equally satirical. Take this season’s scenario: a group of rich elites lounging in a sauna, casually discussing national affairs—absurd, yet uncomfortably real.


    [Film] Where the Wind Comes From

    • Keywords: Drama
    • Also known as: Tunis-Djerba / Where the Wind Comes From
    • Length: 100 minutes; Douban link

    If we can’t even dream, what do we have left?

    @利兹与青鸟: After her father’s death, Alyssa must care for her young sister while also looking after her mother, who suffers from depression. Even if it means becoming an undocumented immigrant, she is desperate to leave Tunisia, which feels like a cage. Her friend Mehdi is a talented painter but grows increasingly anxious due to his inability to find work. The two often escape into art and imagination to briefly forget the pressures of reality—until a painting competition offers them a potential breakthrough, with the winner earning a chance to travel to Germany. They embark on a reckless journey in pursuit of their dreams, tasting freedom but also encountering danger. Their spirited youth scatters in the wind, and life inevitably returns to its quiet rhythm.

    Alyssa and Mehdi’s story reflects the struggles of many young people in North Africa. Rigid social classes and systemic stagnation leave them with few options, yet Alyssa’s vibrant and romantic spirit refuses to be extinguished. She imagines dull classrooms as elegant modern dance performances and rationally believes that raising her sister should be her mother’s responsibility. Alyssa’s impulsive red and Mehdi’s restrained blue stand side by side against white houses, golden deserts, and deep blue starry skies. The bold clash of colors, paired with refined cinematography and precise composition, lends the film a poetic quality. Their defiance and cries resonate with the audience, reminding us that the wind of hope will return.


    [Animation] Daemons of the Shadow Realm

    • Keywords: Manga adaptation / Fantasy / Adventure / Action
    • Also known as: 黄泉のツガイ / Daemons of the Shadow Realm
    • Length: 24 minutes per episode × 24 episodes, updated weekly on Saturdays
    • Where to watch: Bahamut Anime; Douban link

    Night and day, sealed and unsealed.

    @SHY: Yuru, a young hunter from a mountain village, only wishes to stay by the side of his imprisoned twin sister Asa. But their peaceful life is shattered by outsiders. Deep within a dungeon where chaos erupts, another girl claiming to be Asa appears before him. As Yuru uncovers the village’s secrets, he gains new powers and is drawn into an unfolding adventure.

    Adapted from a new series by Hiromu Arakawa, this work may look familiar in style, but it is far more than just swapping brothers for siblings. Its explosive opening reveals a glimpse of its brutal core—when the world you’ve always known collapses before your eyes, the clueless Yuru is forced into a vortex of conspiracy and fate. Strong storytelling supports its tightly interwoven twists, crafting a gripping narrative that balances suspense with humor, constantly pulling viewers forward.

    With animation by Bones, the production receives top-tier treatment. Director Masahiro Ando and veteran screenwriter Noboru Takagi lead an all-star creative and voice cast lineup. Episode one delivers a remarkably steady start—its straightforward yet powerful storyboarding and naturally flowing emotional beats prove that it doesn’t rely on gimmicks, but wins through overall craftsmanship. This fully loaded heavyweight production has the potential to become the next flagship in shonen manga adaptations.


    [Animation] MAO

    • Keywords: Manga adaptation / Fantasy / Adventure / Action
    • Also known as: MAO
    • Length: 25 minutes per episode × episode count unknown, updated weekly on Saturdays
    • Where to watch: Bahamut Anime; Douban link

    Inuyasha, but with cats.

    @SHY: After losing her parents in an accident eight years ago, Nanoka returns to the shopping street where it happened—only to be transported back to the Taisho era, where she is attacked by demons. Saved by the onmyoji Mao, she is mistakenly identified as a demon herself. After returning to the present, she awakens powers beyond ordinary humans, and the fates of the two become intertwined.

    Veteran mangaka Rumiko Takahashi never slows down. Not long after finishing the lighthearted RIN-NE, she returned to a more serious tone with MAO. With its time-travel and demon-slaying setup, comparisons to Inuyasha are inevitable. While it may not reach the peak of her earlier works, Takahashi’s experience still shines through, finding fresh angles within familiar themes. A balance between tense main plotlines and episodic storytelling weaves a lingering dark fantasy adventure.

    Continuing Takahashi’s streak of having all her works adapted into anime, Sunrise reunites much of the team behind Inuyasha and Yashahime. Director Teruo Sato and chief animation director Yoshihito Hishinuma are familiar names. Although the budget may not match top-tier productions, the team’s dedication is evident—the overall presentation feels solid, carrying a nostalgic and reassuring tone. If you enjoy Takahashi’s previous works, this is well worth a try.


    [Animation] Dorohedoro Season 2

    • Keywords: Manga adaptation / Fantasy / Mystery / Action
    • Also known as: ドロヘドロ Season 2 / Dorohedoro Season 2
    • Length: ~25 minutes per episode × 11 episodes, updated weekly on Wednesdays
    • Where to watch: Bahamut Anime / Netflix; Douban link

    Who killed me, and whom did I kill?

    @SHY: Caiman, whose head has been turned into that of a lizard and who has lost all memory of his past, hunts sorcerers alongside his partner Nikaido in hopes of returning to normal. After a series of encounters, the two—now pursued by the En family—hide within the sorcerers’ world, while Shin, Noi, and others begin uncovering clues about the Cross-Eyes organization. Multiple factions collide, tangling everything into chaos.

    After putting out the fire of Attack on Titan: The Final Season, director Yuichiro Hayashi finally returns “to the Hole for a meal of dumplings,” reuniting with the original MAPPA team to deliver this unrestrained, visceral feast. With the shift from TV broadcast to streaming-exclusive, each episode now has more flexible pacing—and pushes the limits even further, opening with brutally graphic fight scenes that hit like a punch to the face. The exhilarating audiovisual language propels a story that charges forward at full speed, continuing its grotesque yet mesmerizing cult aesthetic with thick, painterly visuals.

    Original creator Q Hayashida excels at weaving order out of chaos through multi-threaded storytelling, deepening this bizarre and grotesque world with her distinctive dark humor—where characters can rob and kill, then casually dance on graves; tear out hearts and still go the extra mile in dismemberment. The plot constantly unfolds in unexpected ways, blending madness with moments of warmth that offer emotional relief—there’s always a relationship that will move you. After long buildup, the lingering mysteries begin to unravel, revealing Caiman’s true identity and Nikaido’s past, while what lies ahead remains hidden within the chaos.


    [Animation] Invincible Season 4

    • Keywords: Manga adaptation / Sci-Fi / Action / Adventure
    • Also known as: Invincible Season 4
    • Length: ~50 minutes per episode × 8 episodes, updated weekly on Wednesdays
    • Where to watch: Prime Video; Douban link

    Who is this apology really for—her, or yourself?

    @SHY: After his deadly battle with Conquest, “Invincible” Mark continues fighting to protect Earth, while Eve’s powers begin to spiral out of control, putting new strain on their relationship. At this critical moment, Omni-Man Nolan—now with renewed purpose—reappears, asking Mark to join a war against the Viltrumites that will determine the fate of the galaxy.

    Following a somewhat underwhelming third season, the production quality has seen a noticeable rebound. With the stage expanding to an interstellar scale, the action sequences grow more spectacular, delivering bone-crunching fights that get your blood pumping. Even as the storylines continue to expand, the writing remains solid, offering a well-balanced adaptation of the original comic. It places both superheroes and ordinary people face-to-face with their own problems, creating a rich array of compelling moments. Alongside returning familiar faces, new characters like Zoe bring fresh charm to the series.

    Equally important as the war with the Coalition of Planets and the Viltrumites are the emotional bonds between characters. Mark and Eve, entering a new phase in their relationship, must reassess their connection; Oliver struggles to come to terms with the father he once idolized; Nolan, returning to Earth, tries to atone for his past and repair his relationships with both family and humanity. But after committing unforgivable atrocities, is he worthy of redemption? By crafting complex characters that go beyond traditional superhero and antihero archetypes, the series gains deeper humanistic resonance.


    [Reality Show] Punghyanggo 2

    • Keywords: Reality Show
    • Also known as: Punghyanggo 2
    • Length: ~45 minutes per episode × 10 episodes; Douban link

    Travel—but without using apps to check your route.

    @潘誉晗: When you think about it, smartphones really are convenient. For example, when traveling, we can book train tickets and flights in advance, reserve hotels, compare prices across apps to find the best deals, and even look up nearby food recommendations on social platforms. But what if you were given the chance to travel abroad—without being allowed to use your phone, relying only on paper maps and guidebooks?

    Punghyanggo returns for its second season with the same premise. Lee Sung-min, fresh off winning Best Supporting Actor, trades his suit for casual wear, contributes his share of travel expenses (the group pays their own way), and sets off with Yoo Jae-suk, Ji Suk-jin, and Yang Se-chan on a journey to Austria and Hungary. Of course, traveling without phones proves inconvenient—they drag their suitcases from hotel to hotel, only to find rooms either fully booked or beyond their budget, especially during Europe’s peak Christmas season. Fortunately, they take it all in stride, and once accommodations are sorted, they quickly slip into travel mode.

    Freed from their phones and returning to the essence of travel, this middle-aged quartet finds themselves enjoying the journey even more—simply by focusing on the experience itself.


    More

    [Chinese Series] Young and Ambitious @潘誉晗: Overworked corporate drone Pei Qian receives a mysterious offer from a powerful figure, asking him to start a company—with the promise that the more money the company loses, the more he personally earns. Determined to get rich and buy a house for his parents, Pei Qian embarks on a grand plan to lose money. Adapted from Losing Money to Be a Tycoon by Qing Shan Qu Zui, the series feels like a hybrid of Hello Mr. Billionaire and Too Cool to Kill—an exaggerated premise that humorously captures the struggles of everyday workers.

    [Chinese Series] Unpredictable @利兹与青鸟: A clue discovered in 2011 prompts police to reopen the investigation into the major June 10 credit union robbery-murder case, identifying Meng Guangcai as the prime suspect. The case is assigned to his longtime friend Zhu Helai. Now the chairman of the city’s first quasi-listed company, bringing Meng to justice inevitably involves complex power struggles. The series adopts a dual timeline structure, interweaving past and present to depict the characters’ friendships and the unpredictability of fate. While the pacing may feel less gripping at times and the overall execution is fairly conventional, the strong cast makes it worth a watch.

    [Japanese Series] Farewell Hospital Season 2 @利兹与青鸟: This is a hospital dedicated to patients requiring intensive care and unable to live independently. Some suffer from cognitive disorders that alter their personalities, others remain unconscious with no family to contact, and there is even a bestselling author in the late stages of cancer. Many elderly patients feel they have lived long enough, while a nurse battling cancer herself still longs to live and realize her value. Nurse Henmi serves as both witness and participant, helping fulfill patients’ final wishes while reflecting on the meaning of life. Through a series of gently told stories, the show delivers something both moving and healing—well worth your time.

    [Japanese Series] Before Your Betrayal Leads to Execution @潘誉晗: In 2026, serial killer Okuma Shiori, responsible for a string of murders targeting teachers, is sentenced to death. While preparing a documentary about the case, Sakabe Kotaro reunites with two college friends. For unknown reasons, during a car ride, they are suddenly transported back seven years, where they encounter Shiori—then on the run for the first murder. Yet she insists she is innocent. Blending a god’s-eye perspective on crime with time-travel elements, this series offers a fresh twist for fans of the genre.

    [Film] All You Need Is Kill @SHY: After being killed by an alien creature, rookie soldier Rita awakens once again on the same morning, trapped in an endless time loop—until she meets Kiryu Keiji, who is caught in the same cycle. Better known internationally under the title Edge of Tomorrow, this 2004 light novel has already inspired two manga adaptations and a Hollywood blockbuster, and now finally receives an anime adaptation. Produced by Studio 4°C, the film features a distinctive artistic style. Its narrative also breaks away from previous versions, telling the story from Rita’s perspective, offering a completely different flavor from earlier adaptations.


    📅 New Trailers This Week

    The Devil Wears Prada 2 — Final Trailer

    On April 6, The Devil Wears Prada 2 released its final trailer. The original cast returns, including Meryl Streep, Emily Blunt, Anne Hathaway, Stanley Tucci, Tracie Thoms, Tibor Feldman, along with director David Frankel. The screenplay is once again written by Aline Brosh McKenna. The film is set to release on April 30.

    Michael Jackson: The Journey of a Legend — Final Trailer

    On April 8, the biographical film Michael Jackson: The Journey of a Legend released its final trailer and will hit mainland theaters on April 24. Directed by Antoine Fuqua, the film stars Jaafar Jackson as Michael Jackson, portraying his life beyond the stage and recreating some of the most iconic performances from his early career. Source

    Mortal Kombat 2 — New Trailer

    On April 10, the film Mortal Kombat 2 released a new trailer and is set to premiere in mainland China on May 8. Classic characters return in force, including Johnny Cage (played by Karl Urban), Kitana (played by Adeline Rudolph), Liu Kang (played by Ludi Lin), and Raiden (played by Tadanobu Asano), as a life-or-death battle to save Earth is about to begin. Source

    Rick and Morty Season 9 — Official Trailer

    On April 8, the animated series Rick and Morty Season 9 released its official trailer and is scheduled to premiere on May 24 on Adult Swim. Voiced by Ian Cardoni, Harry Belden, Sarah Chalke, Chris Parnell, and Spencer Grammer, the adventures of Rick and Morty continue, bringing even more chaos and insanity. Source

    Black Castle — Official Trailer

    On April 9, the mystery film Black Castle released its official trailer. Adapted from a Naoki Prize–winning novel by Honobu Yonezawa, the film is directed and written by Kiyoshi Kurosawa. The cast includes Masahiro Motoki, Masaki Suda, Yuriko Yoshitaka, Takayuki Aoki, Tasuku Emoto, and Joe Odagiri. It will be released in Japan on June 19.

    More

    The animated film Mononoke the Movie: Snake God released a new trailer. As the final chapter of the theatrical trilogy, with Kenji Nakamura as chief director and Tomoaki Koshida directing, and produced by Studio Kafka and EOTA studio daisy, it continues from Karakasa and Hinezumi. The Medicine Seller’s final ritual of salvation is about to begin. It will be released in Japan on May 29. Source

    Marvel special The Punisher: Last Stand released its first trailer. Starring Jon Bernthal, Chelsea Frei, Tom Johnson, Dominic Manzano, and Eve DePaolis, the story follows Frank as he searches for meaning beyond revenge, only to be pulled back into battle by an unexpected force. It premieres on Disney+ on May 12. Source

    The film Tonight, It’s Just Right released its first trailer. Written and directed by Zhao Badou and starring Ma Sichun and Chen Haosen, the film centers on a chance encounter that begins online. Designer Xu Qiu and writer Chen Yuzhou move from hesitation to honesty over the course of a single night, exploring the possibilities of modern relationships. Scheduled for release in 2026. Source

    The film The Invite released a trailer. Directed by Olivia Wilde and starring Seth Rogen, Penélope Cruz, and Edward Norton, the film is a remake of the 2020 Spanish movie Sentimental. It tells the story of a long-married couple facing a crisis in their relationship and will have a limited theatrical release in North America on June 26.

    📽 Film & TV News Weekly

    Three One Piece-related previews released
    On April 7, the live-action US series adaptation of One Piece unveiled its first teaser for Season 3, subtitled “The Battle of Alabasta,” and is set to premiere on Netflix in 2027; a brand-new animated series, Lego One Piece, released its first trailer and is scheduled to debut on September 29 on Netflix; meanwhile, the remake anime THE ONE PIECE, produced by WIT STUDIO, also dropped a new teaser, with more details to be announced later. Source

    Paper-cut animation feature Ranbiwa set for April 28 release
    On April 8, the Shanghai Animation Film Studio released a release-date trailer and poster for its first feature-length hand-painted animation on Xuan paper, Ranbiwa, which will premiere nationwide on April 28 through the Art Film Alliance circuit. Voiced by Zhou Xun and Yang Haoyu, the film follows the young boy Ranbiwa and his companion “Dog” as they journey to a sacred mountain in search of the ultimate secret of “warmth,” telling an adventure story about growth, courage, and companionship. Source

    Norwegian film Emotional Value set for April 17 mainland release
    On April 8, the film Emotional Value released its mainland China release-date trailer and poster, confirming a theatrical debut on April 17. Directed by Joachim Trier and starring Renate Reinsve, Elle Fanning, Stellan Skarsgård, and Cory Michael Smith, the film has previously won the Academy Award for Best International Feature and the Cannes Jury Prize, and was also recommended in a previous issue of “What to Watch.” Source

    Thai romantic comedy Confession Out of Time set for May 20 mainland release
    On April 8, the Thai romantic comedy Confession Out of Time released its mainland China release-date trailer and poster, announcing a May 20 theatrical release. Directed by Sitthiphong Kittikhunthaworn and starring Wongravee Nateetorn and Plearnpichaya Komalarajun, the film tells the story of Guy and June, who secretly love each other yet constantly miss their chances. It will also appear in the premiere section of the Beijing International Film Festival. Source

    Sci-fi series Dark Matter Season 2 scheduled
    On April 7, the sci-fi series Dark Matter Season 2 was set to premiere on August 28 on Apple TV+. Originally released in 2024, the series stars Joel Edgerton and Jennifer Connelly, and is based on the novel of the same name by Blake Crouch, who also serves as writer, executive producer, and showrunner. The story follows physicist and family man Jason Dessen, who is kidnapped into a parallel universe.

  • SSPAI Morning Brief:Google Gemini Adds Notebook Feature for AI Knowledge Management; Dyson Launches Portable HushJet Mini Bladeless Fan

    SSPAI Morning Brief:Google Gemini Adds Notebook Feature for AI Knowledge Management; Dyson Launches Portable HushJet Mini Bladeless Fan

    Morning Brief

    1. WeChat Pay launches Skill tool integration system
    2. Amflow introduces electric mountain bike lineup
    3. Dyson releases HushJet Mini Cool bladeless handheld fan
    4. Google Gemini introduces “Notebook” feature
    5. MiniMax launches MMX-CLI command-line tool
    6. Coze releases version 2.5 update
    7. Tencent unveils QClaw v2 and QBotClaw
    8. News Worth a Quick Look

    WeChat Pay launches Skill tool integration system

    WeChat Pay has launched a new AI-oriented integration solution, covering Skill packages, AI-friendly documentation, and AI-friendly APIs. Developers can load Skill packages with simple instructions and complete integration through conversational interfaces in common AI development tools.

    According to WeChat Pay, the solution organizes business knowledge, code examples, and integration specifications in a structured way, enabling AI agents to understand requirements and assist developers in completing payment integrations more efficiently. Each Skill corresponds to specific product capabilities, covering integration workflows, API examples, and interaction guidelines to ensure accurate and reliable outputs. The Skill system integrates multiple categories, including basic payments and coupon-related functions, further lowering development barriers and improving integration efficiency for merchants and developers. Source


    Amflow introduces electric mountain bike lineup

    On April 9, Amflow, an electric bike brand spun off from DJI, introduced two product lines: PX and PR. These models are equipped with the Avinox M2S and M2 drive systems, supporting up to 1500W peak power and 150Nm torque, pushing beyond traditional trade-offs between weight, range, and power in e-assist systems. The carbon fiber frame supports up to 40 geometry adjustment combinations, offering enhanced adaptability and customization for riders.

    Additional features include navigation display, Apple Find My integration on the PR series, and heart rate–based intelligent assistance adjustment. The bikes also support expandable battery options, with the standard 600Wh battery charging from 0% to 80% in 1.5 hours. In terms of pricing, the Amflow PX Carbon is now available in Europe and Australia starting at $7,999, while the PR Carbon is expected to launch later this year with a starting price of $4,999. Source


    Dyson releases HushJet Mini Cool bladeless handheld fan

    On April 9, Dyson introduced the HushJet Mini Cool, a handheld bladeless fan that brings its Air Multiplier technology into a portable form factor. The device features a 65,000 RPM brushless motor, delivering wind speeds of up to approximately 55 mph, with multiple adjustable speed levels.

    It is equipped with a 5,000 mAh battery, offering up to 6 hours of runtime, with noise levels ranging from 52 dBA to 72.5 dBA. The device adopts a cylindrical design with a diameter of about 38 mm, and its nozzle can be rotated to adjust airflow direction. It also supports neck-worn use for hands-free cooling. The product is now available at a price of $99, initially offered in Stone and Light Pink color options. Additional colors—including Onyx Red and Sky Blue—will be released in May, followed by Ink Black and Cobalt Blue in June. More color variants and accessories, such as stroller mounts and clip attachments, are planned for future expansion. Source


    Google Gemini introduces “Notebook” feature

    On April 9, Google announced a new “Notebook” feature for Gemini, designed to organize content around specific topics. Users can add files, past conversations, custom instructions, and more into a notebook, which Gemini will use as contextual reference during interactions.

    This feature is similar to the “Projects” function introduced by ChatGPT in 2024, both enabling centralized storage of materials around a single topic. Google describes notebooks as a personal knowledge base shared across its ecosystem, currently available within Gemini. In addition, Gemini notebooks will sync with Google’s AI research tool NotebookLM, meaning content added in one application will appear in both.

    The feature is currently available on the web for Gemini Ultra, Pro, and Plus subscribers, with mobile support expected in the coming weeks and a gradual rollout to free users planned. Source


    MiniMax launches MMX-CLI command-line tool

    MiniMax has introduced MMX-CLI, a command-line tool designed for AI agents. It supportsmultimodal capabilities, including text, image, video, voice, and music generation, and can be integrated into environments such as Claude Code and OpenClaw. By optimizing underlying interaction logic, the tool enables agents to autonomously complete end-to-end workflows—from information retrieval to multimedia generation.

    MMX-CLI also includes features such as output isolation, pure data mode, JSON structured outputs, and a semantic status code system, improving the stability and controllability of automated tasks. Additionally, it supports non-blocking and asynchronous execution, making it suitable for parallel processing scenarios. The tool is integrated with the MiniMax Token Plan, allowing direct access to model quotas. Source


    Coze releases version 2.5 update

    On April 7, Coze announced its 2.5 update, introducing an AI agent collaboration platform called “Agent World,” along with new features such as independent email identities, cloud phones, and cloud PCs. It also upgrades capabilities for video creation agents and programming CLI tools.

    According to the official introduction, Coze 2.5 focuses on three aspects: “persona,” “skills,” and “equipment.” In terms of persona, the new version introduces independent email identities and a memory system, enabling AI agents to communicate via email and build long-term memory of users’ work habits and preferences.

    On the “equipment” side, the update adds a workspace capability for agents. After users describe tasks in conversation, the agent can automatically generate schedules. Users can view plans via a calendar, set timed tasks, and manage agent states. Data, charts, and business reports generated during execution are categorized and stored in a file system.

    In addition, Coze has launched a programming CLI that covers project creation, online preview, and cloud deployment, while also providing real-time messaging and skill extension interfaces tailored for agent-based scenarios. Source


    Tencent unveils QClaw v2 and QBotClaw

    On April 9, Tencent introduced QBotClaw (“Lobster”), an intelligent assistant tool for the QQ Browser. Built directly into the browser, it supports contextual understanding of web content and enables automated operations through webpage element recognition. According to Tencent, users can issue natural language commands to perform tasks such as cross-software operations, information extraction, and file processing, making it suitable for scenarios like data organization, text generation, price comparison, and multi-page content analysis.

    QBotClaw supports integration with third-party large model APIs and can be remotely controlled from mobile devices via the WeChat-based Clawbot service to execute tasks on a computer. The feature is currently available on the Mac version of QQ Browser, with a Windows version yet to be released. Source

    Also released on the same day, QClaw v2 introduces a multi-agent mechanism, application connectors, and a “Lobster Manager” feature. Users can create multiple agents with different capabilities and permission settings, and connect them to third-party applications. Tencent states that the new version can reduce the number of operational steps in certain scenarios.

    In terms of security, QClaw includes the “Lobster Manager” as a safety module to warn or block potential risks, including abnormal commands, misuse of permissions, and access to sensitive information. Source


    News Worth a Quick Look

    • YouTube has introduced an AI feature for Shorts that allows creators to generate digital avatars resembling their appearance and voice for video creation and content expansion. Users must record a “live selfie” including facial and voice data to build the model, with better lighting and quieter environments yielding higher-quality results. Once generated, creators can produce videos up to about 8 seconds long based on prompts or insert avatars into eligible Shorts content to streamline production and expand creative possibilities. Google states that the feature is limited to creators’ own content and can be deleted at any time; data will be automatically removed after three years of inactivity. All avatar-generated videos will be labeled as AI-generated and include digital watermarks such as SynthID and C2PA for identification. Source
    • OpenAI has introduced a new ChatGPT Pro subscription tier priced at $100 per month, targeting high-intensity users. Compared to the $20 Plus plan, it offers approximately five times the usage quota for Codex programming tools, making it suitable for long-duration, high-load development tasks. Like the $200/month Pro tier, it provides access to Pro models and increased availability of both real-time and reasoning models compared to Plus. This new tier fills the gap between the Plus plan and the existing $200 high-end Pro offering, further refining the subscription lineup. OpenAI also noted adjustments to Codex usage for Plus subscribers: “More sessions per week will be supported, rather than longer sessions in a single day.” Source
  • RayNeo Air3 AR Glasses: A One-Year Usage Review

    RayNeo Air3 AR Glasses: A One-Year Usage Review

    Author’s note: This review is based on over a year of personal use after purchasing the RayNeo Air3 with my own money. All impressions come from real, everyday usage scenarios.

    Introduction

    Before I knew it, it’s been over a year since I bought the RayNeo Air3. My original goal was very clear: to solve the problem of watching movies in a shared dorm environment. I wanted a more immersive viewing experience in the dorm, and paired with noise-canceling headphones, it could block out most of the noise and ambient light, letting me enjoy quiet, uninterrupted personal time.

    Purchased on JD.com
    RayNeo Air3 and OPPO Enco X3

    However, over the past year, I’ve had several moments where I considered listing it second-hand on Xianyu. But every time I took it out again to watch a few movies, I ended up feeling glad that I didn’t sell it.

    First Impressions: Strong Highlights, Equally Noticeable Drawbacks

    Rewinding to March 2025, the day the RayNeo Air3 arrived, I immediately connected it to my computer to try out the viewing experience. At the time, I jotted down this first impression: although the specs say it’s just a 1080P OLED display, the actual visual quality exceeded expectations. In low-light environments, the immersion is very strong. Fortunately, I didn’t remove the black protective film on the lenses. Since the glasses don’t come with a light-blocking cover and don’t support electrochromic dimming, the factory-applied black film actually helps reduce light leakage to some extent. The moment the display lights up, it truly feels like having a private cinema screen all to yourself—though this is something a phone camera simply cannot capture.

    RayNeo Air3 black protective film
    In-glasses display

    In addition, the audio quality from the built-in speakers in the temples is better than expected for this size, offering a decent sense of spatial depth, and the sound leakage prevention algorithm works quite well.

    Close-up of temple speakers

    However, the drawbacks are also very apparent:

    The heat from the right temple doesn’t sit directly against the face, but after wearing it for just a few minutes, the warmth becomes noticeable. Without air conditioning in summer, this would likely feel even more uncomfortable;

    The display performs well for video, but when switching to text-based work, the text appears slightly blurry. No matter how I adjusted the fit or screen position, the edges of the display remained somewhat unclear;

    Blurry text at the edges

    Although the official spec claims a virtual viewing distance of three meters, after wearing it for 20–30 minutes, my eyes would feel slightly strained. The Air3 also doesn’t support 6DoF or even 3DoF—though at this price point, that’s perhaps expected. Still, even slight head movements cause the image to shift, which can easily lead to dizziness;

    RayNeo Air3 official product page

    Because it uses a Birdbath optical design, and with prescription lenses attached, reflections of the surrounding environment are quite noticeable in brighter conditions or when wearing light-colored clothing. The only ways to reduce this are by adjusting head angle and posture, but the most effective solution is simply to turn off the lights and minimize ambient brightness.

    Lens reflections of the surrounding environment

    A Year of Long-Term Use: A Shift in Both Mindset and Experience

    Over the past year, my usage of the RayNeo Air3 hasn’t been particularly consistent. During winter and summer breaks, when I’m back home, I have a 27-inch 2K monitor as my primary setup. Sometimes, I wouldn’t even open the glasses’ case for an entire month—and it’s precisely during those times that the urge to sell them second-hand becomes the strongest.

    But whenever I come across a horror movie I want to watch, or films and series with lots of dark scenes, I instinctively plug the glasses into my computer via the Type-C cable. After all, it’s an OLED display—its ability to render true blacks is irreplaceable. Dark scenes are truly dark, and bright areas pop as they should. More importantly, once you put on the glasses, the entire image fills your field of view, delivering full immersion without interruptions from phone notifications or sudden screen light-ups. Honestly, watching horror movies with it is incredible—I’ve been genuinely startled by jump scares more than once.

    Dark-scene video content

    And after more than a year of use, perhaps my brain has gradually adapted to the slight image drift. The dizziness I initially felt has significantly decreased. My single-session usage time has extended from the original 20–30 minutes to over an hour, and sometimes I can wear it continuously for two to three hours.

    I also have to praise the weight distribution and the “air nose pad” design. Even though the glasses themselves weigh only 76 grams—and I’ve added prescription lenses—there’s still little to no discomfort on the bridge of my nose after long periods of wear. This genuinely exceeded my expectations.

    Close-up of the “air nose pad”

    Rethinking Whether to Keep It: As Scenarios Change, So Does Its Value

    Now that I’m in the second semester of my second year of graduate school, all my courses are finished, and I’m spending less and less time in the dorm. It might really be time to say goodbye to these glasses.

    After graduation, I’ll most likely no longer be sharing a living space. With my own private room, achieving immersion becomes simple—just close the curtains, turn off the lights, and I’ll have a completely undisturbed environment. At that point, the core advantage of these glasses won’t feel as significant anymore.

    Final Buying Advice: No Hype, Just Who It’s For

    I’ve seen quite a few influencers online promoting AR glasses like these, highlighting use cases such as “working on a large screen anywhere” or “enjoying entertainment anytime, anywhere.” For the former, I personally do not recommend it at all. For text-based work, the display already has issues with slightly blurry edges, and since it’s a 0DoF device, the screen moves with your head. When dealing with small text, you can’t lean in to see more clearly—you either strain your eyes or manually zoom in, both of which hurt productivity. Of course, devices with 3DoF or 6DoF might perform differently, but I haven’t personally tested them, so I won’t comment further.

    As for entertainment, if you’re looking for a portable large screen to enjoy immersive movies on your own—without disturbing others or being disturbed—then this type of device is indeed a solid choice.

    But if you already have a private space with a reasonably sized monitor, and you just want to relax with a movie occasionally, simply turning off the lights, closing the curtains, and putting on headphones or using speakers can offer an experience not far off from what the RayNeo Air3 provides. And the money you save could easily get you dozens of tickets to an IMAX theater.

  • USB-C Isn’t Truly Universal: Why Sold-Out C-to-C Adapters Reveal a Fragmented Standard

    USB-C Isn’t Truly Universal: Why Sold-Out C-to-C Adapters Reveal a Fragmented Standard

    Recently, I bought a pair of lithium-ion AA rechargeable batteries. Compared to traditional NiMH rechargeable batteries, they’re lighter, have higher voltage, and even come with a built-in USB-C charging port.

    Lithium-ion rechargeable batteries with USB-C

    I thought I could finally get rid of that bulky NiMH battery charger. But to my surprise, they wouldn’t charge after I got them, so I contacted customer support. The reply left me speechless:

    Please use the included A-to-C cable for charging. This product does not support C-to-C charging.

    I stared at that cheap black A-to-C cable and fell into deep thought.

    If they all use USB-C, why isn’t it universal?

    USB-C ≠ USB-C

    While searching for a solution, I came across a video mentioning a newly released C-to-C adapter that can fix devices that don’t support charging via C-to-C cables. The name is quite odd—“5.1K resistor adapter”—and it sold out immediately after launch, with comments under the official video full of people asking for restocks.

    I had only heard of adapters like Lightning to USB-C or micro USB to USB-C—those that convert between different connector types. I never expected to see a USB-C to USB-C adapter for the same connector format. So while trying to grab one, I also discussed USB-C standardization, charging, and data transfer with others online. That’s when I finally understood the root cause of why my batteries wouldn’t charge.

    In short, the device didn’t follow the USB specification for setting identification resistors. As a result, the charger cannot determine whether it should supply power, and therefore fails to charge the device.

    This situation is quite common in small appliances such as handheld fans, portable lamps, and flashlights. They all use USB-C ports, but can only be powered using A-to-C cables.

    So why don’t manufacturers follow the standard design? And what exactly does the official specification require? Let’s briefly go over how USB-C is supposed to work.

    Further reading: Choosing a cable isn’t just about the connector — a guide to common USB and Thunderbolt protocols

    Introduction to the USB-C Specification

    The USB-C interface is highly versatile, supporting high-power charging and discharging, audio and video signal transmission, and reversible plug orientation. Precisely because of its rich functionality, its internal structure is also relatively complex.

    USB-C pin definition

    A full USB-C connector consists of 24 pins, with the A side and B side arranged in mirror symmetry. Based on function, they can be broadly divided into four categories: power, data transfer, control, and auxiliary.

    Power

    VBUS: A4, A9, B4, B9
    → Responsible for power delivery, defaulting to 5V and reaching up to 48V depending on the protocol

    GND: A1, A12, B1, B12
    → Ground lines that complete the circuit and ensure stability

    Data Transfer

    Low-speed channels: D+ / D- (A6, A7, B6, B7)
    → Basic USB 2.0 data communication (480 Mbps)

    High-speed channels: TX / RX (A2, A3, B2, B3, A10, A11, B10, B11)
    → Used for high-speed data communication such as USB 3 / USB 4 / Thunderbolt

    Control (Most Critical)

    CC: A5, B5

    • Determine plug orientation
    • Determine power direction (who supplies power)
    • Negotiate current and voltage
    • Enable fast charging / video modes

    Auxiliary

    SBU: A8, B8
    → Used for auxiliary audio or video signals (such as DisplayPort)

    As mentioned earlier, the missing identification resistor refers to a 5.1K pull-down resistor (Rd) on the CC pins. Without it, the device cannot be recognized as a power sink, so the charger will not supply power. This 5.1K resistance value is also the standard Rd value defined by USB-IF.

    However, the issues with USB-C are not as simple as just missing a “pull-down resistor.”

    A Unified Exterior, a Fragmented Reality

    USB-C is indeed an excellent connector form, but it is still far from achieving the USB-IF vision of “universal, simple, and unified device connectivity and interoperability.”

    Stripped-Down Connectors

    In practice, it’s rare for devices to use all 24 pins. Manufacturers often trim functionality based on actual needs. For example, many small appliances remove data-related pins and retain only the power-related ones—leaving just 6 pins, which is a reasonable cost-saving strategy.

    In fact, many devices previously used micro USB. Since the USB-A port on the charger side is always the power source by default, there’s no need to negotiate power direction like USB-C does, so the device circuitry didn’t include identification resistors. After switching to USB-C, some manufacturers chose not to redesign the internal circuitry to save costs, which is why these devices cannot be charged with C-to-C cables.

    In other words, these cables may wear a USB-C shell, but inside, they’re still the familiar micro USB.

    A USB-C female port with only 4 pins

    For example, the USB-C receptacle shown above has only 4 pins. It provides D+ / D- for USB 2.0 low-speed data transfer, along with VBUS and GND for power, but lacks CC pins. As a result, devices using this type of connector cannot be charged with C-to-C cables.

    In other cases, the connector includes CC pins, but manufacturers fail to solder the required 5.1K identification resistor. Some hands-on users have even added the resistor themselves to enable C-to-C charging.

    A manually soldered identification resistor

    Different Power Support

    Even if we only look at charging, C-to-C cables with identical appearances can vary greatly in charging speed. In my own case, my power bank can trigger 90W fast charging on a Xiaomi phone using the original C-to-C cable, while some other cables can only reach up to 20W. If you’re unaware of this, your expensive high-wattage charger might end up running at a much lower power level.

    To achieve 60W or higher charging power, you need to choose cables that support 3A or higher specifications.

    Cables supporting 6A current

    Expensive

    Nowadays, many monitors support a single-cable setup. With just one C-to-C cable connecting your computer and monitor, you can transmit video while charging your laptop, keeping your desk clean and tidy.

    However, anyone familiar with this setup knows that not just any C-to-C cable will work. You need a Thunderbolt 3 or higher standard cable, or a full-featured USB-C cable. These cables can cost several times—or even over ten times—more than regular C-to-C cables.

    Original iPhone cable, 6A cable, full-featured USB-C cable

    The Proliferation of Proprietary Charging Protocols

    You could argue that the issues above stem from hardware differences and cost constraints. But the proprietary charging protocols developed by many smartphone manufacturers—especially in China—are a problem at the protocol level.

    As early as 2014, Chinese smartphone makers began competing on charging speeds, pushing from 60W to 90W and even beyond 100W. At the time, official Power Delivery (PD) standards could not meet their needs, so they developed their own proprietary fast-charging protocols. Well-known examples include OPPO’s VOOC, Huawei’s SuperCharge, and Xiaomi’s HyperCharge. These modified protocols did achieve high-speed charging, even outperforming brands like Apple and Samsung in this area.

    However, proprietary protocols require a dedicated charger, cable, and compatible device to reach full speed. Once you switch brands or use multiple devices, compatibility breaks down, and charging speeds may drop to 18W or even lower. In some high-power chargers, these proprietary protocols may conflict with the standard PD protocol, leading to negotiation failures, power fallback, or repeated handshakes.

    In essence, proprietary protocols recreate new “ecosystem barriers” on top of the supposedly “unified” USB-C interface.

    Confusing Official Naming

    Beyond the inconsistencies caused by manufacturers’ cutbacks and modifications in hardware and protocols, repeated changes in naming by USB-IF have further increased the complexity for users:

    In 2008, USB-IF introduced the USB 3.0 standard.

    In 2013, USB 3.1 was released, renaming the original USB 3.0 to USB 3.1 Gen 1, while USB 3.1 became USB 3.1 Gen 2.

    In 2017, USB-IF renamed the standard again to USB 3.2, changing USB 3.1 Gen 1 to USB 3.2 Gen 1, USB 3.1 Gen 2 to USB 3.2 Gen 2, and adding USB 3.2 Gen 2×2 (20Gbps).

    ……

    TimeOfficial Standard (at the time)Old NameNew Name (at the time)Actual Speed
    2008USB 3.0USB 3.05Gbps
    2013USB 3.1USB 3.0USB 3.1 Gen 15Gbps
    2013USB 3.1USB 3.1 Gen 210Gbps
    2017USB 3.2USB 3.1 Gen 1USB 3.2 Gen 15Gbps
    2017USB 3.2USB 3.1 Gen 2USB 3.2 Gen 210Gbps
    2017USB 3.2USB 3.2 Gen 2×220Gbps

    Originally, it was already difficult to distinguish USB-C cables by appearance alone. These repeated official renamings have made things even more confusing, making it harder for users to tell them apart. As a result, some users created diagrams to mock this situation.

    Past vs Present

    However, careful readers might notice: we’ve been talking about USB-C, so why are we now discussing USB 3? This confusion actually comes from mixing up connector types and protocols.

    USB-C refers to the physical connector shape, while USB 3 refers to the underlying protocol. It’s just that the latest USB protocols mostly use the USB-C connector and are the most widely adopted, so people often confuse the two concepts.

    Connector vs Protocol

    Conclusion

    A few days later, my “5.1K C-to-C adapter” finally arrived. This tiny device adds the missing identification resistor, allowing the charger to recognize the connected device as a power sink and supply power accordingly.

    My problem was solved—but what about USB-C? It seems to have many issues: inconsistent implementation, fragmented protocols, and uneven user experience. But these may only be surface-level symptoms. The real issue is that USB-C uses a unified connector shape to mask a complex and fragmented ecosystem of implementations and protocols.

    Its problem has never been that it isn’t unified—it’s that it only appears to be.

    References:

  • SSPAI Morning Brief: GitHub Launches Cross-Model AI Code Review, Zhipu Unveils GLM-5.1 Flagship Model

    SSPAI Morning Brief: GitHub Launches Cross-Model AI Code Review, Zhipu Unveils GLM-5.1 Flagship Model

    Morning Brief

    1. Zhipu releases flagship model GLM-5.1
    2. Older Kindle devices will no longer be able to download store content
    3. DeepSeek introduces Expert Mode
    4. GitHub launches cross-model AI review feature
    5. SanDisk releases 2TB Extreme Pro UHS-II SD card
    6. Sony launches Playerbase program

    Zhipu releases flagship model GLM-5.1

    On April 8, Zhipu AI announced the launch of its world-leading open-source flagship model, GLM-5.1. Its core breakthrough lies in achieving “long-horizon task” capabilities, enabling the model to operate continuously for over eight hours without human intervention, autonomously completing the entire workflow of engineering tasks—from planning and execution to optimization and delivery.

    In terms of professional coding ability, GLM-5.1 ranks third globally across three industry benchmarks—SWE-Bench Pro, Terminal-Bench 2.0, and NL2Repo—making it the top-performing domestic and open-source model. Notably, in SWE-Bench Pro, which closely reflects real-world software development scenarios, GLM-5.1 surpassed GPT-5.4 and Claude Opus 4.6, setting a new global best score.

    Currently, GLM-5.1 is accessible via first-party APIs and through the GLM Coding Plan, and has been open-sourced on GitHub, Hugging Face, and ModelScope. Open source


    Older Kindle devices will no longer be able to download store content

    On April 8, Amazon spokesperson Jackie Burke announced in an email to The Verge that starting May 20, 2026, Kindle e-readers and Kindle Fire tablets released in 2012 or earlier will no longer be able to purchase, borrow, or download new content from the Kindle Store. Users will still be able to read books already downloaded on their devices, and can access previously purchased content through the Kindle mobile app, web-based Kindle, or newer devices.

    If older devices are deregistered or reset after the May deadline, users will not be able to register them again.

    Amazon will notify affected users via email before May 20, outlining the available and restricted features for legacy devices. Source


    DeepSeek introduces Expert Mode

    On April 8, DeepSeek launched a new Expert Mode, adding a toggle between “Fast Mode” and “Expert Mode” above the input field. This marks the first time DeepSeek has introduced a tiered mode design in its product.

    According to DeepSeek, Fast Mode is suitable for everyday conversations, offering instant responses and support for text recognition in images and files. Expert Mode is designed for complex problems, supporting deep reasoning and intelligent search. However, the new mode currently does not support file uploads or multimodal features, and DeepSeek notes that users may experience wait times during peak usage. Source


    GitHub launches cross-model AI review feature

    On April 6, GitHub announced in a blog post an experimental feature called Rubber Duck for its Copilot CLI, introducing a cross-model “second opinion” review mechanism that can improve AI performance by nearly 75%.

    The feature adopts a cross-family model strategy: when users select a Claude model as the primary controller, Rubber Duck calls GPT-5.4 for review. Its core function is to audit the agent’s work and generate a high-value checklist, including overlooked details, questionable assumptions, and edge cases.

    According to the post, evaluations based on the SWE-Bench Pro benchmark show that pairing Claude Sonnet 4.6 with Rubber Duck closes 74.7% of the performance gap compared to Claude Opus 4.6 running independently.

    The feature is currently available in experimental mode. Users can enable it by installing GitHub Copilot CLI and running the /experimental command. After activation, selecting a Claude model and enabling access to GPT-5.4 allows users to experience the feature. Source


    SanDisk releases 2TB Extreme Pro UHS-II SD card

    On April 8, SanDisk introduced a 2TB Extreme Pro UHS-II SD card aimed at professional imaging users. The card targets the high-end market, offering sequential read speeds of up to 310 MB/s and write speeds of up to 305 MB/s. It is designed for professionals shooting 8K video or requiring high-resolution continuous shooting, features an IP68 rating, and can withstand drops from up to 6 meters. The product is priced at $1,999.99. Source


    Sony launches Playerbase program

    • On April 8, Sony announced the launch of the Playerbase program, aiming to bring real players directly into the worlds of PlayStation Studios games.
    • The core feature of the program is scanning players’ appearances and integrating their likeness into in-game environments, allowing developers to recreate players’ real-world looks and performances within game scenes. Staff will use multi-camera capture systems to record players from various angles, and the footage will then be processed into high-precision 3D models, accurately reproducing facial details, body proportions, and texture quality. Additional scans or recordings will capture facial motion data, which can be mapped onto rigging systems for character animation and dialogue performance. In some cases, players may also provide motion data, allowing their physical movements to serve as references for in-game animations or background character performances.
    • Players can apply for the program for a chance to be officially scanned and integrated into a game. PlayStation will review applications, shortlist candidates for video interviews, and ultimately select one lucky fan whose likeness will be featured in Gran Turismo 7. Source
  • SSPAI Morning Brief: macOS TCP Stack Vulnerability Discovered, Cyberpunk 2077 Gets PS5 Pro Enhanced Update

    SSPAI Morning Brief: macOS TCP Stack Vulnerability Discovered, Cyberpunk 2077 Gets PS5 Pro Enhanced Update

    Morning Brief

    1. macOS kernel TCP stack has an overflow vulnerability
    2. Google Pixel 10a launches in Japan with an exclusive variant
    3. Minisforum releases Elite M1 Lite mini PC
    4. Motorola unveils Moto G Stylus (2026) and Moto Pad
    5. Anthropic and others launch Project Glasswing cybersecurity initiative
    6. Adobe introduces Student Spaces learning tool
    7. 360doc to shut down service on May 1
    8. Cyberpunk 2077 receives PS5 Pro enhanced update
    9. News Worth a Quick Look

    macOS kernel TCP stack has an overflow vulnerability

    A blog post published by Photon on April 6 revealed a TCP stack vulnerability in macOS that can lead to network failure. The issue is triggered by a 32-bit counter in the Apple XNU kernel. When the system uptime reaches 49 days, 17 hours, 2 minutes, and 47 seconds, the internal TCP timer stops functioning. As a result, TCP connections in certain states cannot be properly recycled, eventually exhausting ephemeral ports and preventing the system from establishing any new TCP connections. Only certain types of connections, such as ICMP, remain unaffected.

    Photon’s testing confirms that this kernel flaw affects at least macOS Catalina 10.15 and later versions, while it is unclear whether earlier versions are impacted. Notably, this issue does not cause kernel crashes or generate obvious error logs, making it a silent failure that is difficult for users to diagnose in a timely manner.

    Interestingly, due to macOS’s system update cycle, Apple typically pushes updates and triggers restarts within the window before the bug occurs. As a result, only devices that run continuously for extended periods without restarting are likely to be affected. Source


    Google Pixel 10a launches in Japan with an exclusive variant

    On April 7, Google announced that the Pixel 10a has officially opened for pre-orders in Japan. The company also partnered with local creative agency HERALBONY to introduce a Japan-exclusive color variant, “Isai Blue.”

    This limited edition is available only in a 256GB configuration, with hardware specifications identical to the standard version. It comes bundled with an exclusive protective case and stickers, and features custom wallpapers designed by HERALBONY artists, along with a system-wide theme package in a unified color style.

    In addition to the limited-edition color, other color options are also available simultaneously. The Google Pixel 10a starts at JPY 79,900 in the Japanese market, with the new “Isai Blue” limited edition set to begin shipping on May 20. Source


    Minisforum releases Elite M1 Lite mini PC

    On April 7, Minisforum announced the Elite M1 Lite-125U mini PC, powered by Intel’s Meteor Lake U series Core Ultra 5 125U processor. The device measures 130×126×50.4mm and weighs 0.6kg. It supports a 35W TDP with phase-change TIM and a turbo fan cooling system, keeping noise levels below 45dB. It features two DDR5-5600 SO-DIMM memory slots and two M.2 2280 PCIe Gen4 SSD slots. Networking includes an Intel I226-V wired NIC, along with Wi-Fi 6E and Bluetooth 5.2 for wireless connectivity. For ports, the front panel offers two USB-A 10Gbps ports and a 3.5mm combo audio jack. The rear includes a full-function USB-C 40Gbps port supporting both 100W PD input and 15W output, two USB-A 480Mbps ports, one HDMI 2.1 TMDS, one DisplayPort 1.4, and a 2.5GbE RJ45 port.

    The device is now available on domestic e-commerce platforms, starting at RMB 2,119. Source


    Motorola unveils Moto G Stylus (2026) and Moto Pad

    On April 6, Motorola officially announced the Moto G Stylus (2026), featuring a stylus, along with the new Moto Pad tablet in the U.S. The Moto G Stylus (2026) upgrades from a basic capacitive stylus to an active electromagnetic pen with tilt detection and pressure sensitivity. It includes a 4mAh battery offering up to 4 hours of continuous use and supports 15-minute fast charging. The stylus also enables quick editing, drag-and-drop split-screen, hover zoom, and Google’s Circle to Search via built-in buttons. In terms of hardware, the device is powered by the Qualcomm Snapdragon 6 Gen 3 processor, with 8GB RAM and 128GB or 256GB UFS 3.1 storage. It features a 6.7-inch 120Hz AMOLED display, a 50MP rear main camera, and a 5200mAh battery supporting 68W wired and 15W wireless fast charging, along with IP68/69 dust and water resistance.

    The Moto Pad, unveiled alongside it, comes with an 11-inch 2K display with a 90Hz refresh rate, powered by the MediaTek Dimensity 6300 chip. It includes a 7040mAh battery with 20W charging support, but does not support stylus input.

    The Moto G Stylus (2026) is priced at $499 and will go on sale on April 16. The Moto Pad will be available via T-Mobile starting April 30, with pricing yet to be announced. Source


    Anthropic and others launch Project Glasswing cybersecurity initiative

    On April 7, Anthropic, together with 11 companies including Amazon, Apple, Google, Microsoft, and NVIDIA, launched the Project Glasswing cybersecurity initiative. The project aims to mitigate the risks of AI being maliciously used to attack global economic and security systems by defensively deploying AI tools. Specifically, it leverages an unreleased general-purpose model, Claude Mythos Preview, to identify and fix vulnerabilities in critical software worldwide. The model has already detected thousands of exploitable vulnerabilities across mainstream operating systems and browsers. Source


    Adobe introduces Student Spaces learning tool

    On April 7, Adobe officially launched Student Spaces, an AI-powered learning suite designed for education. The tool allows users to upload PDFs, documents, presentations, spreadsheets, web links, handwritten notes, and transcripts to automatically generate flashcards, mind maps, quizzes, study guides, and editable presentations powered by Adobe Express. It also integrates the AI podcast feature introduced last month, enabling users to convert study materials into listenable audio content.

    Student Spaces is currently in free testing. Users can access it via a standalone URL without logging in, and it supports both the Acrobat web version and mobile browsers. Source


    360doc to shut down service on May 1

    On April 1, the 360doc Personal Library website announced it will cease operations, stating that it was unable to find a suitable party to take over the service while ensuring data security and continuity. All services will officially shut down on May 1, 2026. Starting immediately, 360doc has suspended new content publishing, but login, data backup, VIP refund requests, and wallet withdrawal functions remain available. Source


    Cyberpunk 2077 receives PS5 Pro enhanced update

    On April 7, CD PROJEKT RED announced via an official blog post a free PS5 Pro enhanced update for Cyberpunk 2077. This update introduces support for PSSR upscaling technology and implements BVH8 architecture optimized for PS5 Pro hardware, improving the accuracy of ray-traced lighting, shadows, and reflections. The game now features three rendering modes: the Ray Tracing Pro mode enables all ray tracing effects, including ambient occlusion, global illumination, and emissive lighting, targeting 40fps on VRR displays (30fps on standard displays); the Performance mode maintains high visual fidelity while using PSSR to boost frame rates up to 90fps (VRR required); and the Ray Tracing mode serves as a balanced option, preserving some ray tracing enhancements while delivering a stable 60fps experience.

    The update is available starting today, and the game has also been added to the PlayStation Plus catalog. Source


    News Worth a Quick Look

    • Some users have discovered that the new Copilot app introduced by Microsoft on Windows 11 is למעשה based on the Microsoft Edge browser. By simply renaming mscopilot.exe to msedge.exe and renaming its folder from Copilot to Edge, users can launch Copilot as Microsoft Edge even if both Microsoft Edge and Edge WebView2 have been uninstalled. Source
    • On April 7, Google announced the introduction of native vertical tabs in the Chrome browser. After updating to the latest stable version, users can right-click the window and select “Move tabs to the side” to enable the feature. This update also includes a revamped Reading Mode, offering a more focused, distraction-free text reading experience. Source
    • On April 6, The New Yorker published a 16,000-word investigative report by Ronan Farrow and Andrew Marantz. Based on interviews with more than 100 sources and private memos from figures such as Ilya Sutskever and Dario Amodei, the report alleges systematic deception by OpenAI CEO Sam Altman, including withholding safety protocol details from the board, forming a secret agreement with Greg Brockman to bypass board oversight, and failing to fulfill commitments to allocate compute resources to the Superalignment team. Source
    • According to a recent analysis cited by Ars Technica from The New York Times, Google’s AI Overviews feature has an error rate of around 10% when answering queries. This means that roughly one in ten responses may contain incorrect information. While the system is accurate about 90% of the time, Google’s massive user base means that even a 10% error rate results in hundreds of thousands of incorrect responses being delivered to users worldwide every minute. Source