Keras does not work with 3 or more dimensions: InvalidArgumentError

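I want to create a model that takes a matrix of shape (64, 4) and predicts an array of shape (4). But for some reason it doesn't work. For example, the code below: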

import numpy as np
from tensorflow.keras import models, layers, optimizers

x = np.random.uniform(size=600*64*4).reshape(600, 64, 4)
y = np.random.uniform(size=600*4).reshape(600, 4)

model = models.Sequential([
    layers.Dense(16, activation='relu', input_shape=[64, 4]),
    layers.Dense(16, activation='relu'),
    layers.Dense(4)
])
model.compile(loss='mean_absolute_error',
              optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              metrics=["mae"])

model.fit(x, y, epochs=5)

This script fails with InvalidArgumentError: Incompatible shapes: [32,64,4] vs. [32,4]
But the code below:

import numpy as np
from tensorflow.keras import models, layers, optimizers

x = np.random.uniform(size=600*4).reshape(600, 4)
y = np.random.uniform(size=600*4).reshape(600, 4)

model = models.Sequential([
    layers.Dense(16, activation='relu', input_shape=[4]),
    layers.Dense(16, activation='relu'),
    layers.Dense(4)
])
model.compile(loss='mean_absolute_error',
              optimizer=optimizers.SGD(learning_rate=1e-3, momentum=0.9),
              metrics=["mae"])
model.fit(x, y, epochs=5)

...works fine. To me this behavior looks like a logic error. Maybe I'm misunderstanding something.

Please help.

Comments
  • 大四喜 replied

    The following works:

    from tensorflow.keras.layers import Input, Dense, Flatten
    from tensorflow.keras.models import Model
    import numpy as np
    
    x = np.random.uniform(size=600*64*4).reshape(600, 64, 4)
    y = np.random.uniform(size=600*4).reshape(600, 4)
    
    ip = Input(shape=(64,4))
    d1 = Dense(16, activation='relu')(ip)
    f = Flatten()(d1)
    d2 = Dense(16, activation='relu')(f)
    d3 = Dense(4)(d2)
    
    model = Model(ip, d3)
    
    model.compile(loss='mse', metrics=['mae'], optimizer='adam')
    model.summary()
    
    model.fit(x,y,epochs=1, batch_size = 64)
    

    You're using Dense incorrectly. Fully connected (FC) layers mostly expect single-dimensional data, and you need to flatten at some layer so that the final output is consistent with y.

    Model: "model_2"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_3 (InputLayer)         [(None, 64, 4)]           0         
    _________________________________________________________________
    dense_6 (Dense)              (None, 64, 16)            80        
    _________________________________________________________________
    flatten (Flatten)            (None, 1024)              0         
    _________________________________________________________________
    dense_7 (Dense)              (None, 16)                16400     
    _________________________________________________________________
    dense_8 (Dense)              (None, 4)                 68        
    =================================================================
    Total params: 16,548
    Trainable params: 16,548
    Non-trainable params: 0
    _________________________________________________________________
    10/10 [==============================] - 0s 3ms/step - loss: 0.1689 - mae: 0.3340
    
    <tensorflow.python.keras.callbacks.History at 0x7f838cd4deb8>
    

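    Check out the Flatten layer, and make sure the model output is 2-D: (batch_size, num_classes or output_nodes). Without flattening, your model produces a 3-D output, so you would have to make y 3-D as well.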
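
The shapes and parameter counts in the summary above can be reproduced by hand. The following is a minimal NumPy sketch, not Keras code: the `dense` helper is a hypothetical stand-in that mimics how a Dense layer multiplies along the last axis only, activations are omitted because they do not change shapes, and the batch size of 8 is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 8  # arbitrary batch size chosen for this sketch


def dense(x, units):
    """Hypothetical stand-in for a Dense layer: acts on the last axis only.

    Returns the output and the number of parameters (weights + biases).
    """
    w = rng.normal(size=(x.shape[-1], units))
    b = np.zeros(units)
    return x @ w + b, w.size + b.size


x = rng.uniform(size=(batch, 64, 4))

h1, p1 = dense(x, 16)          # (batch, 64, 16), 4*16 + 16 = 80 params
flat = h1.reshape(batch, -1)   # Flatten: (batch, 64*16) = (batch, 1024)
h2, p2 = dense(flat, 16)       # (batch, 16), 1024*16 + 16 = 16400 params
out, p3 = dense(h2, 4)         # (batch, 4), 16*4 + 4 = 68 params

print(h1.shape, flat.shape, h2.shape, out.shape)
print(p1, p2, p3, p1 + p2 + p3)
```

The flatten step is what collapses the 64 rows into the feature axis, so the final output has one row of 4 values per sample, the same shape as y.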

  • 魂淡 replied

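    In the first example, you have 2D features (64x4) and a 1D output of 4 values. Because your model is just a series of simple matrix multiplications, nothing in it reduces the 2D data to 1D. You need to apply a layer such as an RNN or CNN to model the relationships along both directions of your data and then reshape the result, or simply reshape the data.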
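
The "simply reshape" option can be sketched as follows. This is a NumPy-only illustration under stated assumptions, not the asker's Keras model: the weights `w1`, `w2`, `w3` are random placeholders with the same layer sizes (16, 16, 4), and biases are left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=(600, 64, 4))
y = rng.uniform(size=(600, 4))

# Collapse each (64, 4) sample into one vector of 256 features.
x_flat = x.reshape(len(x), -1)  # (600, 256)


def relu(a):
    return np.maximum(a, 0.0)


# Placeholder weights standing in for Dense(16), Dense(16), Dense(4).
w1 = rng.normal(size=(256, 16))
w2 = rng.normal(size=(16, 16))
w3 = rng.normal(size=(16, 4))

pred = relu(relu(x_flat @ w1) @ w2) @ w3  # (600, 4), same shape as y
mae = np.abs(pred - y).mean()             # the loss is now well defined

print(x_flat.shape, pred.shape, y.shape)
```

In Keras terms, this corresponds to training the original Sequential model on `x_flat` with `input_shape=[256]`, or to placing a Flatten layer in the model.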

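    I suggest writing down the shape of each tensor on a piece of paper. You'll quickly see that the shapes are incompatible once you work through the products step by step.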

    EDIT: To make it clearer, run model.summary() before model.fit(). You'll see that the first Dense layer outputs [32, 64, 16] (shown as (None, 64, 16) in the summary; the batch dimension is 32 by default during fit), and so does the second Dense. The last one outputs [32, 64, 4], and to compute the loss TensorFlow has to compare that tensor to the labels you are providing, which are shaped [32, 4]. You can't subtract a 3 by 8 matrix from a 2 by 5 matrix, and the same goes for tensors of different rank and dimensions, which is what is needed here: your loss is literally an average of absolute values of subtractions.
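
The mismatch described in the edit can be reproduced without TensorFlow. The sketch below is a NumPy analogue under stated assumptions: random matrices stand in for the three Dense layers, biases and activations are omitted since they do not affect shapes, and the exception raised is NumPy's broadcasting error rather than TensorFlow's InvalidArgumentError.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=(32, 64, 4))  # one batch of inputs
y = rng.uniform(size=(32, 4))      # the matching batch of labels

# Without Flatten, each layer multiplies along the last axis only,
# so the 64-row axis is carried through to the output.
h = x @ rng.normal(size=(4, 16))     # (32, 64, 16)
h = h @ rng.normal(size=(16, 16))    # (32, 64, 16)
pred = h @ rng.normal(size=(16, 4))  # (32, 64, 4)

try:
    mae = np.abs(pred - y).mean()
    error_message = None
except ValueError as exc:
    # The shapes (32, 64, 4) and (32, 4) cannot be aligned for subtraction.
    error_message = str(exc)

print(pred.shape, y.shape)
print(error_message)
```

The subtraction fails because the second-to-last axis of the prediction (64) has no counterpart of the same size in the labels (32).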