This C++ library provides an analogue of Swift's Codable protocol.
It may be useful if you have a C++ server backing an iOS or macOS application.
The library's methods have syntax very similar to the Swift version.
Documentation
Before using the encoding and decoding methods and the Codable class, you must include two headers:
#include "Codable.h"
#include "JSON.h"
To make a class Codable, as you would in Swift, derive it from the Codable base class.
For example, if you have a class PhoneBook, its Codable version will look like this:
class PhoneBook: public Codable {};
You must also provide the class with implementations of two virtual methods, as in Swift, and one constructor without arguments:
After that we can encode whatever we want; in the example above we encode the contacts of a phone book.
The contacts variable has type vector<Contact>; the Contact class must also derive from Codable.
Note that this library supports only vectors for JSON arrays; plain C-style arrays are not supported.
Decoding follows the same logic:
We must give the decoder a value of the type we want to decode; in our case it is vector<Contact>().
The second argument of both the encoding and decoding methods is the key of the field in the JSON; it must match the key used in the Swift client application. With that, our PhoneBook class is Codable, and we can encode it with JSONEncoder just as easily as in Swift.
ZSH exoskeleton for the command line “date” utility
Expand the ability of date to handle natural language time references (in English), using only the shell built-ins.
The command line utility date from GNU coreutils is familiar to everyone who uses the Linux command line.
When used with the -d flag, it can parse strings that contain properly formatted datetime constructs and, if successful, print the resulting datetime to standard output, formatted according to any additional formatting flags supplied.
So date -d "next thursday 8:30 pm" will output something like “Thu 09 Feb 2023 08:30:00 PM”
Question: Can date be made a bit more “intelligent” by wrapping it in a command line preprocessor, based only on the built-ins available to the Linux shell (for example bash, or better yet ZSH)?
In other words, can it not explode if the string was instead: "Two years from now, on halloween, at 730 in the morning"?
This repository is an attempt to answer Yes to the above question. The dated command line utility is written in and for zsh and uses zsh built-in machinery (almost exclusively) before handing the preprocessed text to date -d. So, if called at the time of this writing,
dated "Two years from now, on halloween, at 730 in the morning"
will respond with Fri 31 Oct 2025 07:30:00 AM EDT ,
or even for the floating Thanksgiving or Easter:
dated "on Thanksgiving day in 2040 at 8 in the evening"
the output will be Thu 22 Nov 2040 08:00:00 PM EST
Please see the help (dated -h or dated --help) for more examples.
SYNOPSIS: dated [--help|-h] ...<string> [formatting options of date]
- 'dated <string>' Parse <string> to a valid time reference and send to 'date -d'
- 'dated -h|--help' will print this text
- any and all additional arguments are passed as-is to 'date -d' to control formatting
The text in <string> is parsed and formatted into a valid datetime input for 'date -d'.
It is quite difficult for computers to parse our spoken time references, and relying only on standard tools
(i.e. date -d from coreutils) presents a huge challenge when parsing arbitrary datetime text.
There are dedicated, complex NLP tools that work better but they are not perfect either.
EXAMPLES:
"Set for Tuesday" - this is valid.
"for 2023/5/24 at 8pm" - also OK.
"March the 3rd in the evening." - is OK
"on March 16 at 7 in the morning in a couple of years" - OK
"on New years eve, 10 years from now" - OK
"two years from now on halloween at 730 in the morning" - works too
"3rd of March 2024 at 23 hours 13 minutes" - OK
Also valid:
"...for next week"
"...in 3 hours"
"...tomorrow morning" (see source code for "morning" & other adjustable definitions)
"...in 33 hours and 5 minutes"
"...January 23 quarter past 7 in the evening"
Custom: "...at the usual time" allows privacy and customization (see code for ideas)
In some edge cases, parsing succeeds but yields an incorrect datetime; it takes some practice to avoid these.
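To give a flavor of the preprocessing idea, here is a rough sketch in Python. It is purely illustrative: the real implementation is pure zsh and far more thorough, and the substitutions below are simplified, hypothetical examples of rewriting free-form text into something date -d accepts.

```python
import re
import subprocess

# Simplified, hypothetical substitutions; dated's actual zsh rules are much richer.
SUBSTITUTIONS = [
    (r"\bin the morning\b", "am"),
    (r"\bin the evening\b", "pm"),
    (r"\bhalloween\b", "oct 31"),
    (r"\btwo years from now\b", "2 years"),
    (r"\b(\d)(\d{2})\b", r"\1:\2"),  # "730" -> "7:30"
    (r"\b(on|at)\b|,", " "),         # drop filler words and commas that confuse date -d
]

def preprocess(text: str) -> str:
    text = text.lower()
    for pattern, replacement in SUBSTITUTIONS:
        text = re.sub(pattern, replacement, text)
    return " ".join(text.split())

cleaned = preprocess("Two years from now, on halloween, at 730 in the morning")
# hand the cleaned-up string to GNU date, roughly as dated does
print(subprocess.run(["date", "-d", cleaned], capture_output=True, text=True).stdout)
```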
But WHY?
The problem this was built to solve is easiest to explain by looking at Spoken, a set of zsh scripts for recording Joplin text notes and to-dos via speech from the microphone. The td utility in that repository records audio from the microphone, transcribes it to text using whisper.cpp (a derivative of OpenAI's Whisper), and then parses the transcribed, free-form text for a datetime reference so that it can set an automatic notification alarm for the Joplin to-do task. This parsing is performed using the functionality of dated.
Personalized Fitness Guidance: MyFitnessBuddy is a GenAI Fitness Advisor App that provides customized workout routines, diet plans, and a food calorie calculator, addressing the limitations of generic fitness apps.
Advanced Retrieval-Augmented Generation: It leverages a hybrid approach combining Retrieval-Augmented Generation (RAG) and Graph Retrieval-Augmented Generation (GRAG) to deliver accurate and context-aware responses to user queries.
Showcasing Innovation at RAGHack: Developed for the RAGHack hackathon, MyFitnessBuddy demonstrates the power of RAG technologies in creating engaging and effective AI-driven fitness solutions using Azure AI and popular frameworks.
Architecture and Implementation:
Architecture Overview:
Fig.1 Architecture
MyFitnessBuddy uses a hybrid architecture combining Retrieval-Augmented Generation (RAG) and Graph Retrieval-Augmented Generation (GRAG). Data is extracted using a Python script and ingested into Azure Blob Storage for structured data and Azure Cosmos DB (Gremlin API) for unstructured data. Azure AI Search indexes the structured data, while the graph database manages complex relationships in the unstructured data.
The application utilizes Azure AI Studio and Prompt Flow to define chat logic and connect data sources. User queries are processed by the app server, retrieving relevant information from Azure AI Search and Cosmos DB, which is then sent to Azure OpenAI Services (ChatGPT) to generate personalized responses.
This hybrid approach ensures accurate, context-aware, and personalized fitness guidance for users.
Implementation Overview:
Data Extraction and Ingestion:
Fig 2. Data Extraction Architecture
The process begins with a Python script that extracts structured and unstructured data from various sources. This data is then ingested into two different storage systems:
Azure Blob Storage: Used for structured data, which is chunked and indexed.
Azure Cosmos DB (Gremlin API): Used for unstructured data, ingested as GraphDoc to enable graph-based retrieval.
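As a minimal sketch of this ingestion step (the connection strings, container name, file name, and graph details below are placeholders, and the actual extraction script is not reproduced here):

```python
from azure.storage.blob import BlobServiceClient
from gremlin_python.driver import client, serializer

# Structured data -> Azure Blob Storage (placeholder container and file names)
blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = blob_service.get_container_client("fitness-docs")
with open("workouts.json", "rb") as data:
    container.upload_blob(name="workouts.json", data=data, overwrite=True)

# Unstructured data -> Azure Cosmos DB (Gremlin API) as graph vertices
gremlin = client.Client(
    "wss://<cosmos-account>.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/<database>/colls/<graph>",
    password="<cosmos-key>",
    message_serializer=serializer.GraphSONSerializersV2d0(),  # Cosmos DB requires GraphSON v2
)
gremlin.submit(
    "g.addV('food').property('id', 'oatmeal').property('calories', 389)"
).all().result()
```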
Hybrid RAG Approach:
Fig 3. Hybrid RAG Architecture
RAG (Retrieval-Augmented Generation):
The structured data ingested into Azure Blob Storage is connected to Azure AI Search for indexing and retrieval.
Azure AI Studio facilitates the chunking and indexing of data, defining chat logic, and generating endpoints using Azure Prompt Flow.
When a user query is received, Azure AI Search retrieves relevant information from the indexed data.
Graph RAG (Graph Retrieval-Augmented Generation):
Azure Cosmos DB stores the unstructured data in a graph format using the Gremlin API. This approach allows the application to understand complex relationships between entities such as food items, exercises, and user health metrics.
The Graph RAG retrieves contextually relevant knowledge from Azure Cosmos DB, which is then combined with structured data for enhanced response generation.
Fig 4. Example of how unstructured data is stored as a graph in Azure Cosmos DB (Gremlin API)
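For illustration, this kind of graph lookup can be expressed as a Gremlin traversal submitted from Python; the vertex labels, edge names, and connection details below are hypothetical:

```python
from gremlin_python.driver import client, serializer

gremlin = client.Client(
    "wss://<cosmos-account>.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/<database>/colls/<graph>",
    password="<cosmos-key>",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)

# Hypothetical traversal: exercises that target a goal, and the foods that support them
query = (
    "g.V().has('goal', 'name', 'muscle_gain')"
    ".in('targets').hasLabel('exercise')"
    ".out('fueledBy').hasLabel('food')"
    ".values('name')"
)
related_foods = gremlin.submit(query).all().result()
print(related_foods)
```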
Azure AI Studio:
Fig 5. Azure AI Studio Architecture
Prompt Flow
We deployed two endpoints using Azure Prompt Flow: a rewrite intent endpoint and a My Fitness Buddy endpoint. These endpoints address two different use cases: one optimizes document retrieval through query generation, while the other offers personalized fitness advice within predefined safe boundaries, grounded in the RAG knowledge base.
1. Rewrite Intent Endpoint
Purpose: This endpoint handles a specific task: generating search queries based on a user's question and the previous conversation history. By combining the current user question with prior context, the endpoint generates a single canonical query that includes all necessary details, without variants. This is used for document retrieval, where precise queries and intent lead to more accurate results.
Fig 6. Flow of Rewrite Intent endpoint
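Conceptually, the rewrite step reduces to a single chat completion call over the conversation history. The sketch below is illustrative only; the deployment name, API version, and prompt wording are assumptions, not the deployed Prompt Flow definition.

```python
from openai import AzureOpenAI

aoai = AzureOpenAI(
    azure_endpoint="https://<aoai-resource>.openai.azure.com/",
    api_key="<api-key>",
    api_version="2024-02-01",
)

def rewrite_intent(history: list[dict], question: str) -> str:
    """Collapse the chat history and current question into one canonical search query."""
    messages = [
        {"role": "system", "content": (
            "Rewrite the user's question as a single self-contained search query. "
            "Include all relevant details from the conversation. Return only the query."
        )},
        *history,
        {"role": "user", "content": question},
    ]
    response = aoai.chat.completions.create(model="<chat-deployment>", messages=messages)
    return response.choices[0].message.content

history = [{"role": "user", "content": "I want to build muscle training at home."}]
print(rewrite_intent(history, "What should I eat after workouts?"))
```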
2. My Fitness Buddy endpoint
Purpose: The second endpoint is the My Fitness Buddy assistant, which offers personalized fitness advice, workout plans, and nutrition tips based on user input. The assistant is programmed to avoid medical advice and to rely solely on the provided dataset, ensuring that all recommendations are safe, motivational, and evidence-based; its knowledge base is retrieved from the document chunks configured as search indexes.
Fig 7. Flow of My Fitness Buddy endpoint
Application Flow:
The user interacts with the MyFitnessBuddy app through a Python Streamlit-based chatbot interface.
The application server processes the user’s query and directs it to the appropriate retrieval system (Azure AI Search for structured data or Azure Cosmos DB for unstructured data) based on the query type.
Relevant information is retrieved from the selected data source and sent to Azure OpenAI Services (ChatGPT) along with a crafted prompt to generate a personalized response.
The final response, enriched with contextually relevant information, is returned to the user via the Streamlit app, providing tailored fitness advice and recommendations.
Fig 8. Application
Fig 9. Testing tool for endpoints
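A condensed sketch of that request path (Streamlit input, Azure AI Search retrieval, Azure OpenAI completion) is shown below; the index name, field names, deployment, and endpoints are placeholders rather than the app's actual configuration:

```python
import streamlit as st
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search = SearchClient(
    endpoint="https://<search-service>.search.windows.net",
    index_name="fitness-index",
    credential=AzureKeyCredential("<search-key>"),
)
aoai = AzureOpenAI(
    azure_endpoint="https://<aoai-resource>.openai.azure.com/",
    api_key="<api-key>",
    api_version="2024-02-01",
)

query = st.chat_input("Ask MyFitnessBuddy...")
if query:
    # Retrieve the most relevant chunks from the structured-data index
    hits = search.search(search_text=query, top=3)
    context = "\n".join(doc["content"] for doc in hits)  # assumes a "content" field
    # Ground the chat completion in the retrieved context
    completion = aoai.chat.completions.create(
        model="<chat-deployment>",
        messages=[
            {"role": "system", "content": "You are a fitness advisor. Answer only from the provided context and avoid medical advice."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
        ],
    )
    st.write(completion.choices[0].message.content)
```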
Technologies Used:
Data Storage and Retrieval: Azure Blob Storage, Azure Cosmos DB (Gremlin API), Azure AI Search.
AI and Language Models: Azure OpenAI Services (ChatGPT).
Data Processing and Logic Flow: Azure AI Studio, Azure Prompt Flow.
Backend and Application Server: Python for data extraction and preprocessing, with multiple integration points for data ingestion and retrieval.
Target Audience:
Fitness Enthusiasts: Individuals who are passionate about fitness and are looking for personalized workout routines and diet plans to optimize their fitness journey.
Health-Conscious Individuals: People who prioritize a healthy lifestyle and want easy access to accurate nutritional information, calorie tracking, and tailored dietary advice.
Beginners in Fitness: Newcomers who need guidance on starting their fitness journey, including basic workout routines, dietary recommendations, and answers to common fitness-related questions.
Busy Professionals: Users with limited time for fitness planning who seek convenient, on-demand access to customized fitness guidance and quick answers to health-related queries.
Individuals with Specific Health Goals: Those with unique fitness goals or health conditions who require personalized plans and advice that consider their specific needs and preferences.
Conclusion and Future Works:
Conclusion
MyFitnessBuddy demonstrates the potential of combining advanced AI techniques like Retrieval-Augmented Generation (RAG) and Graph Retrieval-Augmented Generation (GRAG) to create a highly personalized and context-aware fitness advisor. By leveraging Azure AI’s capabilities and integrating multiple data sources, the app provides customized workout routines, dietary plans, and accurate responses to user queries. This approach enhances user engagement and satisfaction by delivering tailored and relevant fitness guidance.
Future Work
Enhanced Personalization: Further refine the models to provide more granular customization based on user feedback, behavior, and preferences.
Multilingual Support: Implement multilingual capabilities to reach a broader audience globally.
Advanced Analytics: Develop advanced analytics features to provide users with deeper insights into their fitness progress, habits, and trends.
Expanded Data Sources: Incorporate additional data sources such as medical databases and user-generated content to enhance the app’s knowledge base and improve recommendation accuracy.
Purpose: Generate a coverage report for the Spring Boot application's Controller, Service, and Util class methods.
Reason: Increase the code coverage ratio.
Local run steps
1- Add the jacoco-maven-plugin to pom.xml.
2- The coverage report index.html file is generated under the target/jacoco-report directory.
3- Before starting the application, run mvn clean install to generate the MapStruct mapper classes.
4- If a generated MapStruct mapper class is not recognized by the IntelliJ IDE, reload all Maven projects.
5- Start the Spring Boot REST API by running the main method of CustomerInfoApplication.java in your IDE.
6- Alternatively, you can start a Docker container by following the commands below.
NOTE: Execute the Maven commands from the project directory where pom.xml is located to create the Spring Boot executable jar.
$ mvn clean install -U -X
$ mvn spring-boot:run
The generated code coverage report is placed under the "target/jacoco-report/" directory.
Java 11
H2 Database Engine
spring boot
spring boot starter data jpa
spring boot starter web
spring boot starter test
spring boot starter aop
spring boot starter actuator
spring security web
springdoc openapi ui
springfox swagger ui
querydsl-jpa
querydsl-apt
hibernate
logback
mapstruct
mapstruct-processor
hikari connection pool
mockito-core
mockito-junit-jupiter
mockito-inline
Docker
maven
maven-surefire-plugin
maven-failsafe-plugin
jacoco-maven-plugin
Docker build and run steps
NOTE: Execute the docker commands from the directory where the Dockerfile is located.
NOTE: Tested on Windows 10 with Docker Desktop, Engine version 20.10.11.
$ docker system prune -a --volumes
$ docker build . --tag demo
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
demo latest 9d4a0ec3294e 6 minutes ago 288MB
$ docker run -p 8443:8443 -e "SPRING_PROFILES_ACTIVE=dev" demo:latest
This repo contains two minimum viable products that will import a 6 million record .csv file into PostgreSQL.
The first method I created to achieve this uses Stateless Sessions to stringify the data and loop through the data file, while the second method uses Spring Batch processing.
Average runtime for the batch processor with a ThreadPoolTaskExecutor is 2 minutes 33 seconds. Average runtime for the stateless sessions parser/processor is 40 minutes.
Both of these methods will be improved upon in the future by incorporating a MultiResourcePartitioner within the Spring Batch Configuration file, as well as splitting the large dataset into smaller sets, so that multiple threads may operate on different files at a given time.
This project:
Uses Spring Boot and Spring Batch with Spring Data JPA (Hibernate).
Imports data from a CSV file (about 6 million records) to a PostgreSQL database.
Improves batch processing performance by implementing a ThreadPoolTaskExecutor for data chunking and multithreaded execution.
Based on this data, a fraud detection model is built using Python machine learning libraries.
Is intended to be launched through an API Gateway server (linked below).
Instructions to run:
1. Clone this repository to your local machine.
2. Download the financial data from Kaggle. Add this data to “resource/data” and be sure to include the .csv file in your .gitignore!
3. Within main/java/com there are two distinct packages, “batch” and “session”, which contain the batch processor and the sessions processor respectively.
4. Each package has its own main file that can be run.
5. Once the application is launched without issues, head over to Postman and test on your configured port with the route “/load”.
Technologies Used
Java
Spring Boot for REST API
Spring Batch Processing (Open Source Data Processing Framework)
Maven
Factory Design Pattern within Batch Processor
Hibernate
Java Persistence API (JPA)
PostgreSQL
Gateway Server Communication. Gateway Server can be found here.
Livecodable real-time fractal flames in the browser.
Flam3s are a type of iterated function system that generates
fractals that can look similar to flames.
This is a very computationally expensive operation that can be tricky to
parallelize. Thanks to Orion Sky Lawlor
and Juergen Wothke we can build and run flames in
real-time in the browser.
The goal is to provide a framework to accelerate exploration of, and performance
with, fractal flames.
Guide
Flame configs can be built by calling the flame function and setting various properties
of the flame:
flame()
.colorful(.4)
.exposure(3)
Flames are essentially a set of transforms applied recursively to an initial state. These
transforms apply some affine transformation on the coordinate system using the x, y, and o
vector properties and a wvar scalar. The result of this affine transformation is fed into
the a function based on the variation, which further alters the coordinate system.
For example, if you wanted to add a linear (just the affine part) transform to a flame you could do:
This is only a subset of the standard flam3 variations
due to limitations around running on the GPU. If you can come up with the inverse for
any of these variation functions, please submit a PR!
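For readers new to the underlying algorithm, here is a rough conceptual sketch (in Python, purely illustrative and not this library's API) of the core iteration that fractal flames are built on: repeatedly pick a weighted transform, apply its affine part, then its variation, and plot the resulting point.

```python
import math
import random

def linear(x, y):
    # identity variation: keep the affine-transformed point as-is
    return x, y

def sinusoidal(x, y):
    return math.sin(x), math.sin(y)

# Each transform: affine coefficients (a, b, c, d, e, f), a weight, and a variation.
TRANSFORMS = [
    {"affine": (0.5, 0.0, -0.5, 0.0, 0.5, 0.0), "weight": 0.5, "variation": linear},
    {"affine": (0.5, 0.0, 0.5, 0.0, 0.5, 0.5), "weight": 0.5, "variation": sinusoidal},
]

def render(iterations=100_000):
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    points = []
    weights = [t["weight"] for t in TRANSFORMS]
    for i in range(iterations):
        t = random.choices(TRANSFORMS, weights=weights)[0]
        a, b, c, d, e, f = t["affine"]
        # affine step (the x/y/o-style vectors described above)
        xa, ya = a * x + b * y + c, d * x + e * y + f
        # variation step: non-linear warp of the affine result
        x, y = t["variation"](xa, ya)
        if i > 20:  # let the orbit settle before plotting
            points.append((x, y))
    return points

print(len(render(10_000)))
```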
Device Functional Role ID via Machine Learning and Network Traffic Analysis
Overview
NetworkML is the machine learning portion of our Poseidon project. The model in networkML classifies each device into a functional role via machine learning models trained on features derived from network traffic. “Functional role” refers to the authorized administrative purpose of the device on the network and includes roles such as printer, mail server, and others typically found in an IT environment. Our internal analysis suggests networkML can achieve accuracy, precision, recall, and F1 scores in the high 90’s when trained on devices from your own network. Whether this performance can transfer from IT environment to IT environment is an active area of our research.
NetworkML can be used in a “standalone” mode from the command line interface. For more background and context on the macro project, please check out the Poseidon project page on our website. This repository specifically covers the output, inputs, data processing, and machine learning models we deploy in networkML.
While this repository and resulting docker container can be used completely independently, the code was written to support the IQT Labs Poseidon project. See:
This repository contains the components necessary to build a docker container that can be used for training a number of ML models using network packet captures (PCAPs). The repository includes scripts necessary to do training, testing, and evaluation. These can be run from a shell once networkml is installed as a package or run in a Docker container using the networkml script.
Feel free to use, discuss, and contribute!
Model Output
NetworkML predicts the functional role of a network-connected device via network traffic analysis and machine learning.
Admittedly subjective, the term “role” refers to the authorized administrative purpose of the device on the network. NetworkML in its default configuration has twelve roles: active directory controller, administrator server, administrator workstation, confluence server, developer workstation, distributed file share, exchange server, graphics processing unit (GPU) laptop, github server, public key infrastructure (PKI) server, and printer. This typology reflects the network-connected devices in the data we used to train the model. Other networks will lack some of these roles and might include others. Consequently, organizations that wish to use networkML might have to adapt the model outputs for their specific organization.
Model Inputs
NetworkML’s key input is the network traffic for a single device. By network traffic for a single device, we mean all packets sent and received by that device over a given time period. For reliable results, we recommend at least fifteen minutes of network traffic. Poseidon, the larger project of which networkML is only a part, performs the necessary packet pre-processing to produce pcaps containing all network traffic to and from a single device. If you are using networkML in a standalone manner, the pcap files must all follow a strict naming convention: DeviceName-deviceID-time-duration-flags.pcap. For example, ActiveDirectoryController-labs-Fri0036-n00.pcap refers to a pcap from an active directory controller taken from a user named labs on a Friday at 00:36. The flag field does not currently have any significance.
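For illustration (this helper is not part of networkML itself), the convention can be unpacked by splitting the file name on hyphens:

```python
from pathlib import Path

def parse_pcap_name(path: str) -> dict:
    # DeviceName-deviceID-time-duration-flags.pcap,
    # e.g. ActiveDirectoryController-labs-Fri0036-n00.pcap
    fields = Path(path).stem.split("-")
    keys = ("device_name", "device_id", "time", "duration", "flags")
    return dict(zip(keys, fields))

print(parse_pcap_name("ActiveDirectoryController-labs-Fri0036-n00.pcap"))
```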
It is worth noting that networkML uses only packet header data in its models. NetworkML does not use data from the packet payload. Relying only on packet header data enables networkML to avoid some privacy-related issues associated with using payload data and to create (hopefully) more generalizable and more performant models.
Data Processing
Algorithms
NetworkML uses a feedforward neural network from the scikit-learn package. The model is trained using 5-fold cross validation in combination with a simple grid-search of the hyper-parameter space.
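As a rough sketch of what that setup looks like with scikit-learn (illustrative only: the features, labels, and hyper-parameter grid below are placeholders, not networkML's actual ones):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for header-derived features and functional-role labels
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)

pipeline = make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=0))
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(64,), (128, 64)],
    "mlpclassifier__alpha": [1e-4, 1e-3],
}
search = GridSearchCV(pipeline, param_grid, cv=5)  # 5-fold cross validation + grid search
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```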
Installation/Run
Our models can be executed via Docker and in a standalone manner on a Linux host. We recommend deployment via Poseidon if you are running an SDN (software-defined network). Otherwise, we recommend using Docker.
See the README file included in the networkml/trained_models folder for specific instructions on deployment.
Develop/Standalone Installation
Note: This project uses absolute paths for imports, meaning you’ll either need to modify your PYTHONPATH to something like this from the project directory:
export PYTHONPATH=$PWD/networkml:$PYTHONPATH
Alternatively, simply running pip3 install . from the project directory after making changes will update the package to test or debug against.
This package is set up for anaconda/miniconda to be used for package and environment
management if desired. Assuming you have the latest install (as of this writing, we have been using
conda 4.5.12), set up the environment by performing the following:
Ensure that the CONDA_EXE environment variable has been set. If echo $CONDA_EXE
returns empty, resolve this by export CONDA_EXE=$_CONDA_EXE in your bash shell.
Run make dev to set up the environment
Run conda activate posml-dev to begin.
You can remove the dev environment via standard conda commands:
Run conda deactivate
Run conda env remove -y -n posml-dev
For more information about using conda, please refer to their
user documentation.
" You can pick your favorite strategies.letg:eft_index_function= {
\ 'head': function('eft#index#head'),
\ 'tail': function('eft#index#tail'),
\ 'space': function('eft#index#space'),
\ 'camel': function('eft#index#camel'),
\ 'symbol': function('eft#index#symbol'),
\ }
" You can use the below function like native `f`letg:eft_index_function= {
\ 'all': { -> v:true },
\ }
DEMO
NOTE: This demo uses Ff, fmfc with this plugin’s default configuration.