Prologue
This proposal describes an application that integrates Windows application usage information into a data union product on the Streamr marketplace. It covers the motivation, plan, privacy, data buyers, and authors.
Motivation
There are two main reasons for this proposal. First, there are many antivirus products on the market, each collecting user data such as processes, activities, and suspicious files. This fragmentation contributes to several problems:
- Each antivirus company can only access information from the users who have installed its own application. If a user uninstalls that antivirus, there is no more information to share.
- Small companies with few customers collect far less data than the more popular vendors (e.g., Kaspersky).
Second, the world has shifted towards targeted advertising focused on the specific traits, interests, and preferences of a consumer. For example, Google collects a great deal of information from users through the browser or the Google search engine, such as location, history, and search keywords. However, browsers cannot capture operating system information, such as process names, their creators, and the resources they use.
There is a need for an application that sits in the middle and delivers the information users choose to share to the Streamr marketplace, where it can be crowd-sold to companies seeking such data.
Plan
An application called "???" (we will find a name later) will be developed (some key parts have already been developed) to collect usage information about other applications and the network, with the user's permission. The application will run on the Windows family of operating systems. The first version will collect the following data (the list will be refined with experience and feedback):
General capabilities:
- General information
- - Installed software
- - Windows hotfixes
- - Startup programs
- - Browser add-ons
- - System usage (RAM, CPU, network, disk, etc.)
- - Operating system info
- - Hardware info
- - Running processes
- - Installed services
- - Open ports
- - Scheduled tasks
- Other information
- - Process creation and termination, with the process command line and a link to the parent process
- - Information about loaded images in all processes
- - Hashes of process and DLL images
- - Driver loading logs
- - Network connection logs for the TCP and UDP protocols, linked to the process GUID
- - File timestamp changes (e.g., changes to the last modified time)
- - DNS requests along with the response IP addresses
- - All registry events (read, write and rename)
- - NTFS data stream change events
- - File delete events alongside related processes
- - Creation of and connection to named pipes
- - WMI event filter creation and deletion events

The raw data will be parsed and various data sets will be extracted. For instance, a data set containing the following parameters would suit advertising companies:
- Process
- - Name
- - Duration
- - Creator
- - Resources used
- Network connections
- - DNS
- - TCP/UDP
- Services
- Installed programs
In addition, other data sets suited to security researchers will be extracted.
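As an illustration only, the advertising-oriented data set listed above could be represented as a small set of record types. This is a minimal sketch, not the application's actual schema: the class names, field names, and units (`ProcessRecord`, `duration_seconds`, `ram_mb`, and so on) are assumptions introduced here, and "creator" is assumed to mean the parent process.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProcessRecord:
    name: str                 # process name, e.g. "notepad.exe"
    duration_seconds: float   # how long the process ran
    creator: str              # assumed to mean the parent process
    resources_used: dict      # e.g. {"cpu_percent": 1.5, "ram_mb": 120}


@dataclass
class NetworkRecord:
    protocol: str             # "DNS", "TCP" or "UDP"
    destination: str          # queried domain or remote address


@dataclass
class AdvertisingDataSet:
    processes: list = field(default_factory=list)
    network_connections: list = field(default_factory=list)
    services: list = field(default_factory=list)
    installed_programs: list = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses inside lists as well
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical sample values, for illustration only
sample = AdvertisingDataSet(
    processes=[ProcessRecord("notepad.exe", 42.0, "explorer.exe", {"ram_mb": 12})],
    network_connections=[NetworkRecord("DNS", "example.com")],
    services=["Spooler"],
    installed_programs=["7-Zip"],
)
print(sample.to_json())
```

Keeping the record flat and JSON-serializable would make it straightforward to publish as a message payload, whatever transport is eventually chosen.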
Privacy
Because the application collects sensitive information and has broad access to the user's activity and data, privacy is a major concern. The reasonable solution is to make the application's source code publicly accessible. Because the application is open source, users can inspect, and therefore trust, the data filters and the terms of use.
The application will have three operating modes:
- Stopped mode: does nothing and only stands by for schedules
- Offline mode: collects and analyzes the data without sending it
- Online mode: collects, analyzes, and sends the data
The application will warn the users about the data being collected and sent.
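The three operating modes can be summarized as a small table of permitted actions. The sketch below is one possible encoding, under the assumption that Stopped mode neither collects nor analyzes anything; the names `OperatingMode` and `allowed_actions` are hypothetical and do not come from the application.

```python
from enum import Enum


class OperatingMode(Enum):
    STOPPED = "stopped"   # only stands by for schedules
    OFFLINE = "offline"   # collects and analyzes, never sends
    ONLINE = "online"     # collects, analyzes, and sends


def allowed_actions(mode: OperatingMode) -> dict:
    """Return which actions the given mode permits."""
    collects = mode in (OperatingMode.OFFLINE, OperatingMode.ONLINE)
    return {
        "collect": collects,
        "analyze": collects,
        "send": mode is OperatingMode.ONLINE,
    }


for mode in OperatingMode:
    print(mode.value, allowed_actions(mode))
```

Routing every outgoing message through a single check such as `allowed_actions(mode)["send"]` would make it easy to audit that Offline mode never transmits anything.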
For data collection, three modes will be available:
- Snapshot mode:
- Users install the application, configure it to work offline, and let it analyze the system. Users then choose a date, and the application sends only the data changes that occur after that date. The data is derived from system changes and differences. The application never sends old data that was collected and analyzed before the date the user specified.
- The data collected in this mode is very useful for security researchers, antivirus companies, etc., and users need not worry about their privacy, because the software they installed and the sensitive information they entered earlier will not be sent to the marketplace.
- Filtered data mode:
- Users can select which data should be sent and which should not. There will be various filters on topics, content, time, etc. This mode is suited to professional users.
- Easy filtered data mode:
- This mode is the same as the previous one, but the application includes some ready-made templates, so the user can simply select an appropriate one. This mode is suited to ordinary users.
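The three collection modes described above can each be viewed as a filter applied to collected events before anything is sent. The following is a minimal sketch under stated assumptions: events are plain dictionaries with hypothetical `timestamp` and `topic` keys, the template names and their topics are invented for illustration, and an event recorded exactly at the cutoff is treated as sendable.

```python
from datetime import datetime

# Hypothetical templates for the easy filtered mode (names and topics are assumptions)
TEMPLATES = {
    "advertising": {"process", "network"},
    "security": {"process", "network", "registry", "file"},
}


def snapshot_filter(events, cutoff):
    """Snapshot mode: keep only events recorded at or after the user's cutoff date."""
    return [e for e in events if e["timestamp"] >= cutoff]


def topic_filter(events, allowed_topics):
    """Filtered data mode: keep only events whose topic the user allowed."""
    return [e for e in events if e["topic"] in allowed_topics]


def apply_template(events, template_name):
    """Easy filtered data mode: a template is a predefined set of allowed topics."""
    return topic_filter(events, TEMPLATES[template_name])


# Hypothetical sample events
events = [
    {"timestamp": datetime(2020, 1, 5), "topic": "process", "detail": "old.exe started"},
    {"timestamp": datetime(2020, 3, 1), "topic": "registry", "detail": "key written"},
    {"timestamp": datetime(2020, 3, 2), "topic": "network", "detail": "DNS example.com"},
]

recent = snapshot_filter(events, datetime(2020, 2, 1))
print([e["detail"] for e in apply_template(recent, "advertising")])
```

Because each mode is a pure function over the event list, the filters can be chained, and the same functions can drive a preview of exactly what would leave the machine.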
Users can view a copy of the data sent to the marketplace. They can also require approval before the application sends anything: the user sees the data, applies filters, and confirms what to send.
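The review-and-confirm step can be sketched as a gate that every outgoing batch passes through. This is only an illustration: `review_and_send`, the `confirm` and `send` callbacks, and the `sent_log` list are hypothetical names, and the real user interface and the transport to the marketplace are left abstract.

```python
def review_and_send(batch, confirm, send, sent_log):
    """Show a batch to the user and send only what the user approves.

    confirm(batch) returns the (possibly filtered) list of records the user
    agreed to send, or an empty list to send nothing.
    send(records) hands the approved records to the transport.
    sent_log keeps a local copy of everything that was sent.
    """
    approved = list(confirm(batch))
    if not approved:
        return []
    send(approved)
    sent_log.extend(approved)
    return approved


# Usage with stand-in callbacks: the "user" drops any record marked sensitive
outbox, log = [], []
batch = [{"id": 1, "sensitive": False}, {"id": 2, "sensitive": True}]
review_and_send(
    batch,
    confirm=lambda records: [r for r in records if not r["sensitive"]],
    send=outbox.extend,
    sent_log=log,
)
print(outbox, log)
```

Keeping `sent_log` separate from the transport means the user's local copy reflects only what was actually approved and sent.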
Data Buyers
Data buyers are any individuals or companies interested in the information the application collects. Some examples:
- Antivirus companies
- Security researchers (such as CERT centers in the universities)
- Advertising companies
- Security companies (such as companies that produce security devices and look for information to hunt new malware across the internet)
Authors
Pouyan and Alireza plan to develop an application that integrates with Streamr. Pouyan is a product manager, currently working as a freelancer. Alireza is a skilled software engineer who has worked on various scalable projects. We both cooperate with https://zdresearch.com. We are a well-matched team for building and developing this application.