tensorflow-neural-networks

TensorFlow Neural Networks

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy the command below and send it to your AI assistant to learn this skill:

Install skill "tensorflow-neural-networks" with this command: npx skills add thebushidocollective/han/thebushidocollective-han-tensorflow-neural-networks

Build and train neural networks using TensorFlow's high-level Keras API and low-level custom implementations. This skill covers everything from simple sequential models to complex custom architectures with multiple outputs, custom layers, and advanced training techniques.

Sequential Models with Keras

The Sequential API provides the simplest way to build neural networks by stacking layers linearly.

Basic Image Classification

import tensorflow as tf
from tensorflow import keras
import numpy as np

Load MNIST dataset

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

Preprocess data

x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
x_train = x_train.reshape(-1, 28 * 28)
x_test = x_test.reshape(-1, 28 * 28)

Build Sequential model

model = keras.Sequential([
    keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])

Compile model

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

Display model architecture

model.summary()

Train model

history = model.fit(
    x_train, y_train,
    batch_size=32,
    epochs=5,
    validation_split=0.2,
    verbose=1
)

Evaluate model

test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_accuracy:.4f}")

Make predictions

predictions = model.predict(x_test[:5])
predicted_classes = np.argmax(predictions, axis=1)
print(f"Predicted classes: {predicted_classes}")
print(f"True classes: {y_test[:5]}")

Save model

model.save('mnist_model.h5')

Load model

loaded_model = keras.models.load_model('mnist_model.h5')
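As a quick sanity check (not in the original listing), the reloaded model should reproduce the original predictions exactly:

np.testing.assert_allclose(
    model.predict(x_test[:5]),
    loaded_model.predict(x_test[:5]),
    rtol=1e-5
)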

Convolutional Neural Network

def create_cnn_model(input_shape=(224, 224, 3), num_classes=1000):
    """Create a CNN model for image classification."""
    model = tf.keras.Sequential([
        # Block 1
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same',
                               input_shape=input_shape),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),

        # Block 2
        tf.keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        tf.keras.layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),

        # Block 3
        tf.keras.layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
        tf.keras.layers.Conv2D(256, (3, 3), activation='relu', padding='same'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.BatchNormalization(),

        # Classification head
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation='softmax')
    ])
    return model
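The defaults target ImageNet-scale inputs, but the function can just as well be instantiated for smaller problems. A hedged usage sketch; the shapes below are illustrative, not from the original skill:

model = create_cnn_model(input_shape=(128, 128, 3), num_classes=10)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()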

CIFAR-10 CNN Architecture
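The generate_model function below reads its input shape from an x_train array in the enclosing scope, so it assumes CIFAR-10 is already loaded and scaled. A minimal sketch of that setup (not part of the original listing):

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0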

def generate_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), padding='same',
                               input_shape=x_train.shape[1:]),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.Conv2D(32, (3, 3)),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Dropout(0.25),

        tf.keras.layers.Conv2D(64, (3, 3), padding='same'),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.Conv2D(64, (3, 3)),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
        tf.keras.layers.Dropout(0.25),

        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(10),
        tf.keras.layers.Activation('softmax')
    ])

model = generate_model()

Custom Layers

Create reusable custom layers by subclassing tf.keras.layers.Layer.

Custom Dense Layer

import tensorflow as tf

class CustomDense(tf.keras.layers.Layer):
    def __init__(self, units=32, activation=None, **kwargs):
        # **kwargs lets Keras pass arguments like input_shape through to the base Layer
        super(CustomDense, self).__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        """Create layer weights."""
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer='glorot_uniform',
            trainable=True,
            name='kernel'
        )
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            trainable=True,
            name='bias'
        )

    def call(self, inputs):
        """Forward pass."""
        output = tf.matmul(inputs, self.w) + self.b
        if self.activation is not None:
            output = self.activation(output)
        return output

    def get_config(self):
        """Enable serialization."""
        config = super().get_config()
        config.update({
            'units': self.units,
            'activation': tf.keras.activations.serialize(self.activation)
        })
        return config

Use custom components

custom_model = tf.keras.Sequential([
    CustomDense(64, activation='relu', input_shape=(10,)),
    CustomDense(32, activation='relu'),
    CustomDense(1, activation='sigmoid')
])

custom_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
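A hypothetical smoke test on synthetic data, just to confirm the custom layers train end to end (the arrays are random; their shapes match the (10,) input declared above):

import numpy as np

x = np.random.rand(256, 10).astype('float32')
y = np.random.randint(0, 2, size=(256, 1)).astype('float32')
custom_model.fit(x, y, epochs=2, batch_size=32, verbose=0)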

Residual Block

import tensorflow as tf

class ResidualBlock(tf.keras.layers.Layer):
    def __init__(self, filters, kernel_size=3):
        super(ResidualBlock, self).__init__()
        self.conv1 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.conv2 = tf.keras.layers.Conv2D(filters, kernel_size, padding='same')
        self.bn2 = tf.keras.layers.BatchNormalization()
        self.activation = tf.keras.layers.Activation('relu')
        self.add = tf.keras.layers.Add()

    def call(self, inputs, training=False):
        x = self.conv1(inputs)
        x = self.bn1(x, training=training)
        x = self.activation(x)
        x = self.conv2(x)
        x = self.bn2(x, training=training)
        x = self.add([x, inputs])  # Residual connection
        x = self.activation(x)
        return x
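Because the block adds its input back to its output, filters must match the channel count of the incoming tensor. A hedged usage sketch (shapes are illustrative):

inputs = tf.keras.Input(shape=(32, 32, 64))
x = ResidualBlock(64)(inputs)   # 64 filters to match 64 input channels
x = ResidualBlock(64)(x)
outputs = tf.keras.layers.GlobalAveragePooling2D()(x)
model = tf.keras.Model(inputs, outputs)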

Custom Projection Layer with TF NumPy

import tensorflow as tf
import tensorflow.experimental.numpy as tnp

class ProjectionLayer(tf.keras.layers.Layer):
    """Linear projection layer using TF NumPy."""

    def __init__(self, units):
        super(ProjectionLayer, self).__init__()
        self._units = units

    def build(self, input_shape):
        stddev = tnp.sqrt(self._units).astype(tnp.float32)
        initial_value = tnp.random.randn(input_shape[1], self._units).astype(
            tnp.float32) / stddev
        # Note that TF NumPy can interoperate with tf.Variable.
        self.w = tf.Variable(initial_value, trainable=True)

    def call(self, inputs):
        return tnp.matmul(inputs, self.w)

Call with ndarray inputs

layer = ProjectionLayer(2)
tnp_inputs = tnp.random.randn(2, 4).astype(tnp.float32)
print("output:", layer(tnp_inputs))

Call with tf.Tensor inputs

tf_inputs = tf.random.uniform([2, 4])
print("\noutput: ", layer(tf_inputs))
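Since the projection weight is a plain tf.Variable, gradients flow through the TF NumPy ops like any other TensorFlow computation. A short illustration (not from the original skill):

with tf.GradientTape() as tape:
    outputs = layer(tf_inputs)
    loss = tf.reduce_sum(outputs ** 2)   # toy scalar loss
grads = tape.gradient(loss, layer.trainable_variables)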

Custom Models

Build complex architectures by subclassing tf.keras.Model.

Multi-Task Model

import tensorflow as tf

class MultiTaskModel(tf.keras.Model):
    def __init__(self, num_classes_task1=10, num_classes_task2=5):
        super(MultiTaskModel, self).__init__()
        # Shared layers
        self.conv1 = tf.keras.layers.Conv2D(32, 3, activation='relu')
        self.pool = tf.keras.layers.MaxPooling2D()
        self.flatten = tf.keras.layers.Flatten()
        self.shared_dense = tf.keras.layers.Dense(128, activation='relu')

        # Task-specific layers
        self.task1_dense = tf.keras.layers.Dense(64, activation='relu')
        self.task1_output = tf.keras.layers.Dense(num_classes_task1,
                                                  activation='softmax', name='task1')

        self.task2_dense = tf.keras.layers.Dense(64, activation='relu')
        self.task2_output = tf.keras.layers.Dense(num_classes_task2,
                                                  activation='softmax', name='task2')

    def call(self, inputs, training=False):
        # Shared feature extraction
        x = self.conv1(inputs)
        x = self.pool(x)
        x = self.flatten(x)
        x = self.shared_dense(x)

        # Task 1 branch
        task1 = self.task1_dense(x)
        task1_output = self.task1_output(task1)

        # Task 2 branch
        task2 = self.task2_dense(x)
        task2_output = self.task2_output(task2)

        return task1_output, task2_output
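Because call returns one output per task, the model can be compiled with a list of losses, one per head, and fit with a matching pair of label arrays. A minimal sketch; the loss_weights values are illustrative:

model = MultiTaskModel(num_classes_task1=10, num_classes_task2=5)
model.compile(
    optimizer='adam',
    loss=['sparse_categorical_crossentropy', 'sparse_categorical_crossentropy'],
    loss_weights=[1.0, 0.5],  # illustrative task weighting
    metrics=['accuracy']
)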

Three-Layer Neural Network Module

class Model(tf.Module):
    """A three-layer neural network."""

    def __init__(self):
        self.layer1 = Dense(128)
        self.layer2 = Dense(32)
        self.layer3 = Dense(NUM_CLASSES, use_relu=False)

    def __call__(self, inputs):
        x = self.layer1(inputs)
        x = self.layer2(x)
        return self.layer3(x)

    @property
    def params(self):
        return self.layer1.params + self.layer2.params + self.layer3.params
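tf.Module models like this are trained with an explicit loop. A hedged sketch, assuming the Dense helper used here exposes its tf.Variable list via a params property (the Dense class shown later in this skill exposes weights instead) and that images/labels come from the MNIST pipeline defined below:

model = Model()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        logits = model(images)
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=labels, logits=logits))
    grads = tape.gradient(loss, model.params)
    optimizer.apply_gradients(zip(grads, model.params))
    return loss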

Recurrent Neural Networks

Custom GRU Cell

import tensorflow as tf
import tensorflow.experimental.numpy as tnp

class GRUCell:
    """Builds a traditional GRU cell with dense internal transformations.

    Gated Recurrent Unit paper: https://arxiv.org/abs/1412.3555
    """

    def __init__(self, n_units, forget_bias=0.0):
        self._n_units = n_units
        self._forget_bias = forget_bias
        self._built = False

    def __call__(self, inputs):
        if not self._built:
            self.build(inputs)
        x, gru_state = inputs
        # Dense layer on the concatenation of x and h.
        y = tnp.dot(tnp.concatenate([x, gru_state], axis=-1), self.w1) + self.b1
        # Update and reset gates.
        u, r = tnp.split(tf.sigmoid(y), 2, axis=-1)
        # Candidate.
        c = tnp.dot(tnp.concatenate([x, r * gru_state], axis=-1), self.w2) + self.b2
        new_gru_state = u * gru_state + (1 - u) * tnp.tanh(c)
        return new_gru_state

    def build(self, inputs):
        # State last dimension must be n_units.
        assert inputs[1].shape[-1] == self._n_units
        # The dense layers act on the input concatenated with the GRU state.
        dense_shape = inputs[0].shape[-1] + self._n_units
        self.w1 = tf.Variable(tnp.random.uniform(
            -0.01, 0.01, (dense_shape, 2 * self._n_units)).astype(tnp.float32))
        self.b1 = tf.Variable((tnp.random.randn(2 * self._n_units) * 1e-6 + self._forget_bias
                               ).astype(tnp.float32))
        self.w2 = tf.Variable(tnp.random.uniform(
            -0.01, 0.01, (dense_shape, self._n_units)).astype(tnp.float32))
        self.b2 = tf.Variable((tnp.random.randn(self._n_units) * 1e-6).astype(tnp.float32))
        self._built = True

    @property
    def weights(self):
        return (self.w1, self.b1, self.w2, self.b2)
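A single-step usage sketch (shapes are illustrative): the cell takes an (input, state) pair, builds its weights lazily on the first call, and returns the new state:

cell = GRUCell(n_units=8)
x = tnp.random.randn(2, 4).astype(tnp.float32)    # batch of 2, feature dim 4
state = tnp.zeros((2, 8), dtype=tnp.float32)      # last dim must equal n_units
new_state = cell((x, state))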

Custom Dense Layer Implementation

import tensorflow as tf
import tensorflow.experimental.numpy as tnp

class Dense:
    def __init__(self, n_units, activation=None):
        self._n_units = n_units
        self._activation = activation
        self._built = False

    def __call__(self, inputs):
        if not self._built:
            self.build(inputs)
        y = tnp.dot(inputs, self.w) + self.b
        if self._activation is not None:
            y = self._activation(y)
        return y

    def build(self, inputs):
        shape_w = (inputs.shape[-1], self._n_units)
        lim = tnp.sqrt(6.0 / (shape_w[0] + shape_w[1]))
        self.w = tf.Variable(tnp.random.uniform(-lim, lim, shape_w).astype(tnp.float32))
        self.b = tf.Variable((tnp.random.randn(self._n_units) * 1e-6).astype(tnp.float32))
        self._built = True

    @property
    def weights(self):
        return (self.w, self.b)

Sequential RNN Model

This model composes Embedding and GRU wrapper classes that follow the same lazy-build convention as Dense above; their definitions are not included in this listing.

class Model:
    def __init__(self, vocab_size, embedding_dim, rnn_units, forget_bias=0.0,
                 stateful=False, activation=None):
        self._embedding = Embedding(vocab_size, embedding_dim)
        self._gru = GRU(rnn_units, forget_bias=forget_bias, stateful=stateful)
        self._dense = Dense(vocab_size, activation=activation)
        self._layers = [self._embedding, self._gru, self._dense]
        self._built = False

    def __call__(self, inputs):
        if not self._built:
            self.build(inputs)
        xs = inputs
        for layer in self._layers:
            xs = layer(xs)
        return xs

    def build(self, inputs):
        self._embedding.build(inputs)
        self._gru.build(tf.TensorSpec(inputs.shape + (self._embedding._embedding_dim,),
                                      tf.float32))
        self._dense.build(tf.TensorSpec(inputs.shape + (self._gru._cell._n_units,),
                                        tf.float32))
        self._built = True

    @property
    def weights(self):
        return [layer.weights for layer in self._layers]

    @property
    def state(self):
        return self._gru.state

    def create_state(self, *args):
        self._gru.create_state(*args)

    def reset_state(self, *args):
        self._gru.reset_state(*args)

Training Configuration

Model Parameters

# Length of the vocabulary in chars
vocab_size = len(vocab)

# The embedding dimension
embedding_dim = 256

# Number of RNN units
rnn_units = 1024

# Batch size
BATCH_SIZE = 64

# Buffer size to shuffle the dataset
BUFFER_SIZE = 10000
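These constants feed a typical tf.data pipeline. A hypothetical sketch, assuming sequences is a tf.data.Dataset of (input, target) character windows (not defined in this listing):

dataset = sequences.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)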

Training Constants for MNIST

# Size of each input image, 28 x 28 pixels
IMAGE_SIZE = 28 * 28

# Number of distinct digit labels, [0..9]
NUM_CLASSES = 10

# Number of examples in each training batch (step)
TRAIN_BATCH_SIZE = 100

# Number of training steps to run
TRAIN_STEPS = 1000

Load the MNIST dataset and build a repeating, batched training pipeline.

train, test = tf.keras.datasets.mnist.load_data()
train_ds = tf.data.Dataset.from_tensor_slices(train).batch(TRAIN_BATCH_SIZE).repeat()

Cast the raw data to the required datatypes.

def cast(images, labels):
    images = tf.cast(tf.reshape(images, [-1, IMAGE_SIZE]), tf.float32)
    labels = tf.cast(labels, tf.int64)
    return (images, labels)
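The cast function is applied with Dataset.map, so each batch comes out already reshaped and typed:

train_ds = train_ds.map(cast)
images, labels = next(iter(train_ds))  # images: [batch, 784] float32; labels: int64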

Post-Training Quantization

Load MNIST dataset

mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Normalize the input images so that each pixel value is between 0 and 1.

train_images = train_images / 255.0
test_images = test_images / 255.0

Define the model architecture

model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])

Train the digit classification model

model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
    train_images,
    train_labels,
    epochs=1,
    validation_data=(test_images, test_labels)
)
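The listing stops before the quantization step itself. A minimal sketch of dynamic-range post-training quantization with the TFLite converter (the output filename is illustrative):

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

with open('mnist_quantized.tflite', 'wb') as f:
    f.write(tflite_quant_model)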

When to Use This Skill

Use the tensorflow-neural-networks skill when you need to:

  • Build image classification models with CNNs

  • Create text processing models with RNNs or transformers

  • Implement custom layer architectures for specific use cases

  • Design multi-task learning models with shared representations

  • Train sequential models for tabular data

  • Implement residual connections or skip connections

  • Create embedding layers for discrete inputs

  • Build autoencoders or generative models

  • Fine-tune pre-trained models with custom heads

  • Implement attention mechanisms in custom architectures

  • Create time-series prediction models

  • Design reinforcement learning policy networks

  • Build siamese networks for similarity learning

  • Implement custom gradient computation in layers

  • Create models with dynamic architectures based on input

Best Practices

  • Use Keras Sequential for simple architectures - Start with Sequential API for linear layer stacks before moving to functional or subclassing APIs

  • Leverage pre-built layers - Use tf.keras.layers built-in implementations before creating custom layers

  • Initialize weights properly - Use appropriate initializers (glorot_uniform, he_normal) based on activation functions

  • Add batch normalization - Place BatchNormalization layers after Conv2D/Dense layers for training stability

  • Use dropout for regularization - Apply Dropout layers (0.2-0.5) to prevent overfitting in fully connected layers

  • Compile before training - Always call model.compile() with optimizer, loss, and metrics before fit()

  • Monitor validation metrics - Use validation_split or validation_data to track overfitting during training

  • Save model checkpoints - Implement the ModelCheckpoint callback to save the best model during training (see the callback sketch after this list)

  • Use model.summary() - Verify architecture and parameter counts before training

  • Implement early stopping - Add EarlyStopping callback to prevent unnecessary training iterations

  • Normalize input data - Scale pixel values to [0,1] or standardize features to mean=0, std=1

  • Use appropriate activation functions - ReLU for hidden layers, softmax for multi-class, sigmoid for binary

  • Set proper loss functions - sparse_categorical_crossentropy for integer labels, categorical_crossentropy for one-hot

  • Implement custom get_config() - Override get_config() in custom layers for model serialization

  • Use training parameter in call() - Pass training flag to enable/disable dropout and batch norm behavior
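
A hedged sketch of the checkpointing and early-stopping callbacks mentioned above (the file name, patience, and epoch count are illustrative):

callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        'best_model.keras', monitor='val_loss', save_best_only=True),
    tf.keras.callbacks.EarlyStopping(
        monitor='val_loss', patience=3, restore_best_weights=True),
]
history = model.fit(x_train, y_train, epochs=50,
                    validation_split=0.2, callbacks=callbacks)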

Common Pitfalls

  • Forgetting to normalize data - Unnormalized inputs cause slow convergence and poor performance

  • Wrong loss function for labels - Using categorical_crossentropy with integer labels causes errors

  • Missing input_shape - First layer needs input_shape parameter for model building

  • Overfitting on small datasets - Add dropout, augmentation, or reduce model capacity

  • Learning rate too high - Causes unstable training and loss divergence

  • Not shuffling training data - Leads to biased batch statistics and poor generalization

  • Batch size too small - Causes noisy gradients and slow training on large datasets

  • Too many parameters - Large models overfit and train slowly on limited data

  • Vanishing gradients in deep networks - Use residual connections or batch normalization

  • Not using validation data - Cannot detect overfitting or tune hyperparameters properly

  • Forgetting to set training=False - Dropout/BatchNorm behave incorrectly during inference

  • Incompatible layer dimensions - Output shape of one layer must match input of next

  • Not calling build() before weights - Custom layers need proper initialization before accessing weights

  • Using the wrong optimizer - Adam works well in general, but SGD with momentum can outperform it on some tasks

  • Ignoring class imbalance - Use class weights or resampling for imbalanced datasets (see the example after this list)
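
A hypothetical example of the class-weight fix for the last pitfall, using Keras fit's class_weight argument (the weights are illustrative, for a binary problem where class 1 is five times rarer than class 0):

model.fit(x_train, y_train, epochs=5, class_weight={0: 1.0, 1: 5.0})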

Resources

  • TensorFlow Keras Guide

  • Building Custom Layers

  • Training and Evaluation

  • Sequential Model API

  • Model Subclassing Guide

  • RNN Guide

  • Custom Training Loops

  • Transfer Learning Guide

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals. No summaries were provided by the upstream source; all are repository-source listings flagged "needs review".

  • android-jetpack-compose (General)

  • fastapi-async-patterns (General)

  • storybook-story-writing (General)

  • atomic-design-fundamentals (General)